6 Free skies for exterior architectural renders (JPG)
Even with the existence of multiple options regarding skies and backgrounds in Blender like the procedural Sky or Blackbody node, you might find it easier to insert a photo-based reference to get the desired mood. For instance, you may want to render a project with a warm afternoon look. Instead of trying to match the […]The post 6 Free skies for exterior architectural renders (JPG) appeared first on Blender 3D Architect.
A couple of months ago, we shared an impressive project here in Blender 3D Architect from a developer called Bruno Postle, who worked on a framework to quickly create buildings in Blender. The tool works with simple primitive shapes. From those models, it can create all the necessary structures. In November, the tool was in […] The post Blender Homemaker-Topologise updates appeared first on Blender 3D Architect.
Bed and Bathroom: Free architectural interior scene
Last week we shared an excellent resource for artists trying to develop their skills in architectural visualization, which consists of an example scene created with Blender 2.9x. You can download and inspect multiple aspects of the project to learn how the artist solved the illumination of such a project. Do you know that we have […]The post Bed and Bathroom: Free architectural interior scene appeared first on Blender 3D Architect.
When you find a visualization project that shows some impressive results, you probably wonder about the process used by the artist to achieve that level of quality. For that reason, we always recommend that our readers watch and read all material related to the development of a project. It is a powerful learning resource […] The post Making of Al fia Apartment with Cycles appeared first on Blender 3D Architect.
The majority of projects using Blender for architecture have a direct relation to small residential spaces. One of the reasons for that is that it is far easier to develop a project with lots of existing references. That is also a great way of starting with architectural interiors because you have an abundance of […] The post Nu Villa with Blender Cycles appeared first on Blender 3D Architect.
The easiest way of adding a human scale or reference to any 3D project in Blender is using cutout textures of people, which also applies to vegetation and cars if you have the right textures. Most of those textures consider the point of view from the observer at ground level. That works great for most […]The post 8 Free cutout textures of people (Bird’s Eye) appeared first on Blender 3D Architect.
When you have a project that needs maximum realism in a render, a great way of making it look better is with a couple of effects. A popular choice for rendering interiors is a Depth of Field effect, which focuses the render on a single object and applies a blur to the rest of your […]The post New DoF options for Eevee appeared first on Blender 3D Architect.
Blender textures Nodes: Randomly placing a texture across different tiles
The use of tiles to place textures in any project is a huge timesaver for architectural designs because it allows us to cover more surfaces with small textures. But, if you try to use tiles with a repeating pattern on a surface, you immediately lose the realism of any render. Regardless of you can achieve […]The post Blender textures Nodes: Randomly placing a texture across different tiles appeared first on Blender 3D Architect.
For those of you willing to find tutorials and premium content for architectural visualization artists and professionals using Blender, we are proud to present Blender 3D Architect Pro. What is Blender 3D Architect Pro? It is a special section of Blender 3D Architect where we deliver content to develop your skills with Blender even further. […] The post Blender 3D Architect Pro updates 2021011 appeared first on Blender 3D Architect.
A great way to learn is to open an example file of an existing project and take a peek inside to see how the author solved many potential problems. You can look at the 3d models and see how materials and other aspects of the file work. It is an invaluable resource. That is one […]The post OSArch example files repository appeared first on Blender 3D Architect.
A couple of weeks ago, we posted an article about the development of a useful library that could help any artist to display an entire project on the web easily. With IFC.js, you can read IFC files and turn them into a format that is browser friendly. The IFC format is the standard to […] The post IFC.js shows a demo of a BIM project in a browser appeared first on Blender 3D Architect.
How to convert MAX files to FBX or OBJ without 3ds Max?
Using a workflow with open-source tools and file standards ensures that you always have access to your data regardless of the software or environment. If you create a project and save the file in universal formats, you can easily import and send that data to any software. However, it doesn't ensure that you won't receive […] The post How to convert MAX files to FBX or OBJ without 3ds Max? appeared first on Blender 3D Architect.
The project profile series of Blender 3D Architect is a collection of articles aiming to feature projects related to architectural visualization. We invite talented artists to share additional details about each project to demonstrate how they approach each stage. We also allow each author to publicize their work among our readers. How does it work? […] The post Office exterior with Blender Cycles (Profile) appeared first on Blender 3D Architect.
For those of you willing to find tutorials and premium content for architectural visualization artists and professionals using Blender, we are proud to present Blender 3D Architect Pro. What is Blender 3D Architect Pro? It is a special section of Blender 3D Architect where we deliver content to develop your skills with Blender even further. […] The post Blender 3D Architect Pro updates – 2020122 appeared first on Blender 3D Architect.
If you missed any of our articles from last week in Blender 3D Architect, you now have the chance to view a summary of all the content we posted. Among the materials, you will always find content related to architecture, furniture models, and Blender news. Here is a list of articles from last week: How […]The post Last week in Blender 3D Architect 2020: Week 52 appeared first on Blender 3D Architect.
If you recall from a past article here in Blender 3D Architect, an add-on called VI-Suite for Blender can turn the software into an incredible design tool for energy-efficient projects. You can run environmental analysis and all types of simulations related to energy. Do you want to learn how to install VI-Suite in Blender? A […] The post How to install VI-Suite 0.6 in Blender? appeared first on Blender 3D Architect.
How to download content from Blender 3D Architect Pro?
Earlier this month, we started a new project here in Blender 3D Architect to migrate all our content based on subscriptions to Gumroad. There we gave it the name of Blender 3D Architect Pro. There we have a reliable system to keep those files and benefits like a mobile app that you can use to […]The post How to download content from Blender 3D Architect Pro? appeared first on Blender 3D Architect.
Bathroom renovation with Cycles in Bern, Switzerland (Profile)
The project profile series of Blender 3D Architect is a collection of articles aiming to feature projects related to architectural visualization. We invite talented artists to share additional details about each project to demonstrate how they approach each stage. We also allow each author to publicize their work among our readers. How does it work? […] The post Bathroom renovation with Cycles in Bern, Switzerland (Profile) appeared first on Blender 3D Architect.
For those of you willing to find tutorials and premium content for architectural visualization artists and professionals using Blender, we are proud to present Blender 3D Architect Pro. What is Blender 3D Architect Pro? It is a special section of Blender 3D Architect where we deliver content to develop your skills with Blender even further. […] The post Blender 3D Architect Pro updates (2020121) appeared first on Blender 3D Architect.
Learn Blender & Unreal Engine 4 - A Roomscale VR Training Series
CBaileyFilm writes: The Bounty Hunter's Den is an 11-part tutorial series with over 6 1/2 hours of mind-bending awesomeness. When you buy the series on the Blender Market you get the tutorials ad-free plus all of the project files and assets used in the tutorial series, including the final Unreal Engine 4 project [...]
yanourt writes: Hi! A short personal project to continue discovering Blender; this is the second time I have used it. I used Houdini for the water simulation and wall modeling. Hope you will enjoy!
In this free Blender tutorial from our upcoming new lighting course, let’s take a look at animating the lights themselves. There are a lot more ways to animate lighting than first meets the eye; using a simple subway scene, let’s shine a light on the subject! 2 different ways for […] The post Animating Lights in Blender appeared first on Creative Shrimp.
Discover 11 tips for procedural shading in Blender 2.9 with Luca Rood. A new Blender tutorial for procedural texturing fans (nodevember-style). Links: Download the Project Files; Luca on Twitter; Creative Shrimp; Our courses on Gumroad. The post 11 Procedural Shading Tips in Under 10 Minutes appeared first on Creative Shrimp.
Discover how free online tools can help in glitching your art. Some of these glitch apps are really amazing!
https://photomosh.com/ – A free image glitching web app
https://snorpey.github.io/jpg-glitch/ – Jpg Glitch (Image Glitch Tool)
Music: Asterism by Underbelly – Youtube free Audio Library
The post Glitch Effect Tutorial (Using 2 Amazing Web Apps) appeared first on Creative Shrimp.
In this quick tip for Blender, Manu Järvinen shows how to use the Gradient node in the environment material to enhance the chrome reflections. Download the Project Files. A tutorial from the upcoming Creative Shrimp course, Manu-facturing Stylized 3D Art. Stay tuned! Links: Manu Järvinen’s website; Artstation. The post Quick Tip: Gradient Node for Chrome Reflections appeared first on Creative Shrimp.
Different modeling engines (voxel, NURBS, distance field) have different pros and cons. Blender uses a polygonal approach, which is amazingly effective for both precision and speed, BUT! There are some drawbacks; let’s banish those demonic polygonal forces once and for all! Get the full 50 Modeling Issues from Hell course on Blender Market. Get […] The post Extrusion Confusion | Blender Modeling Tutorial appeared first on Creative Shrimp.
All-quad Topology Can Be Bad? | Blender Modeling Tutorial
As a rule of thumb, topology made of quads is desirable, but there are cases in 3D modeling when it can betray you. In this Blender modeling tutorial we’ll talk about this issue and show an easy fix (hint: Decimate Planar). Links: Full course on Blendermarket; Full course on Gumroad. The post All-quad Topology Can Be Bad? | Blender Modeling Tutorial appeared first on Creative Shrimp.
If you like to paint magic powers, this new tutorial is for you!
Youtube: https://youtu.be/vopP_8kJMj8
Peertube: https://peertube.touhoppai.moe/videos/watch/0aae9907-b608-4dd8-908a-d32a88cc0363
It mainly introduces the amazing "Crease" GMIC filter but also shows how to get a dynamic 'out glow' with the layer effect of Krita.
Demo base picture for you to practice: https://www.peppercarrot.com/extras/resources/2021-03-18_Turbulence-magic-effect_demo_CCBY-DavidRevoy.jpg
Subtitle available.
Timeline:
00:00 Intro
00:19 Demo setup
01:00 Start GMIC-Qt for Crease.
02:20 Layer effect for the outer glow.
03:40 Paint-over
04:59 Outro
License: Creative Commons Attribution 4.0 International
Video and artworks by David Revoy, www.davidrevoy.com
Soundtrack:
Intro: Fabian Measures - Hanami (CC-By), www.soundcloud.com/fogheart
Timelapse: Kevin MacLeod - Perspective (CC-By), www.incompetech.com
Outro: Kevin MacLeod - Backed Vibes Clean (CC-By), www.incompetech.com
Edited with Kdenlive 20.12 on Kubuntu Linux 20.04
Next week, I'll be speaking at the online #LibrePlanet 2021 software freedom conference about the problem of fan art. It will be epic, check it out! Sunday 21 March, 10:10 to 10:55 EDT (15h10 Paris time): https://libreplanet.org/2021/
Derivation: Peppertown video-game by Congusbongus and StarNavigator
[Image: GIF animation captured while starting a game]
Peppertown is a Pepper&Carrot RPG-themed idle game by Congusbongus and StarNavigator. You can play it directly on its dedicated itch.io page [1] without installing anything: it runs directly in your browser. To play the game, go to: https://congusbongus.itch.io/peppertown.
The game:
Here is my description of it: A sweet town gets attacked by 2000 Slimes! Pepper, Saffron, Shichimi and Coriander decide to protect it and start to fight back; each success brings a little money, probably paid by the city to reward their hard work... But the girls are quickly exhausted after one or two fights and need to restore their health near the wall of the town. So they all do a lot of walking between the building and the battlefield... At this slow rhythm, it would take years to remove the 2000 slimes!
Your mission, if you accept it, is to manage the money earned by the crew and spend it. You are a sort of strategy manager of the battle: you can click on a shop (four are available) and decide where to spend an upgrade. Something to improve their speed? A better attack level? A faster recovery? A spell to teleport? Many options are possible. The order in which you spend the money defines your strategy to win the battle. It gets funny as soon as you have a slightly overpowered crew clearing the screen.
Free/Libre and open-source:
Thanks to the authors: the game is fully open-source and released on GitHub under the MIT License [2]. It was made with FLOSS tools (GIMP, VS Code, Phaser, Audacity, git, Tiled) for the MiniJam22 contest [3], and congrats to Congusbongus and StarNavigator for reaching 2nd place with Peppertown!
Ideas for the future:
I don't know if this game will continue to receive updates or development, but I wish it would! I really liked seeing the sprite sheet by DiamondDMGirl [4] in action; the sprites work really well.
So here is a quick list of ideas from my perspective: I wish the slimes were not 'killed/destroyed', but captured or just kicked far away. I also wish the shop had other Pepper&Carrot characters (the older witches) and that the items were potions. Maybe before starting a game, three screens could tell the story instead of a single one? Also, I'm not sure how one can lose a game. Maybe a countdown, or a way slimes could start destroying the town? Just ideas. Thanks again to the authors for sharing it!
Links:
[1] Play the game on itch.io
[2] Sources on GitHub
[3] MiniJam22 results
[4] The CC-By Pepper&Carrot sprite sheet made by DiamondDMGirl
Check Your Painting Values with One Krita Shortcut
New video tutorial: how do you set up one keyboard shortcut to check your painting values in #krita while painting? Here is a short (2 min) quick tip to answer this :-)
Soon also on the Peertube mirror managed by the Touhoppai team: https://peertube.touhoppai.moe/accounts/shichimi/videos (thanks!)
Timeline:
00:00 Intro and description.
00:15 Demonstration of how to use the 'Softproofing' shortcut, Ctrl+Y, which lets you preview the canvas in another colorspace.
00:43 How to switch the document properties from the default CMYK "Chemical Proof" model in Absolute mode to the Greyscale model in Perceptual mode.
01:38 At the end of the video, I show you how to apply this change permanently to your next new documents.
02:04 Outro and credits.
How Pixar’s Movement Animation Became So Realistic
Pixar gets its characters to move and emote by building them rigs and filling them with controls that allow animators to give them unique expressions and movements. “Toy Story 2” gave them the ability to adapt and reuse rigs for multiple characters, allowing a wide array of characters of all shapes and sizes. In “Finding […]The post How Pixar’s Movement Animation Became So Realistic appeared one day on IAMAG Inspiration.
IAMAG Inspiration : Art from the Best Artists
GameDev
This is 2021: what's coming in free/libre software
There are many reasons to look back at 2020 with a whole range of emotions but I would rather look forward.
Graphics
The GIMP team is now focusing on completing version 3.0. It’s too early to say if it will be out in 2021. But unless everything changes dramatically, we’ll see several more 2.99.x releases at the very least.
Feature-wise, not much new is planned, although there’s a new Paint Select tool being worked on by Thomas Manni. And before you ask: no, it’s not AI-based. It’s also designed to do quick binary selection for now, so not very good for selecting strands of hair and suchlike.
Admittedly, I avoid talking about Glimpse and Glimpse-NX much. I have two major reasons for that.
First, every public conversation about the fork of GIMP ends up in someone attacking either the Glimpse or the GIMP team, and that’s unproductive and tiresome. And then the progress isn’t all that interesting so far.
The fork is just rebranding and no new features or UX fixes (unless removing the bell pepper brush is your idea of finally making it right for everyone), and then Glimpse-NX, at least to the public eye, exists only as UI mockups. They did get Bilal Elmoussaoui (GNOME team) to create Rust bindings to GEGL for them last autumn, but that’s all as far as I can tell.
So the current pace of the project is not very impressive (again, as a GIMP contributor, I’m biased) and I’m not sure how much we are going to see in 2021.
That said, I think having a whole new image editor based on GEGL would be lovely. I don’t see why Glimpse-NX couldn’t be that project. A proof-of-concept application that would load an image, apply a filter, and export it back sounds feasible. It’s something one could iterate upon.
So maybe that’s how they are going to play it.
The fine folks over at Krita posted a 2020 report where they listed major challenges they will be facing this year: the completion of the resources management rewrite that currently blocks the v5.0 release, the port to Apple M1, the launch of a new development fund (akin to that of Blender), and more.
They also have just released version 4.4.2 with mesh gradients, a mesh transform tool, a new gradient editor etc.
I think it’s safe to say that we might see MyPaint 2.0.2 later this year with some new features and quite a few bugfixes. There hasn’t been much groundbreaking development since 2.0.1 was released in May 2020.
On the other hand, there are nice new features available in GitHub forks and not all of them have been turned into pull requests. But some were. E.g. there’s a guy who added a perspective mode with vanishing points and all, and it’s almost ready, although it’s been in the works for almost 5 years now.
The topic of fullscreen color management implementation in Wayland is back, and it’s a kinda frustrating story. In a nutshell: people who are now working on this (Collabora developers) seem to have little experience with color management but they appear to be motivated to hack on the code; all the while, people who have a crapload of experience with color management have had bad experience discussing this before, do not like the approach by the new team, and don’t seem excited to contribute to this new effort (Graeme’s spec proposal is still available).
So we might end up with an implementation that is not suitable for professional work. At this point, there’s no telling what will happen. Personally, I keep an open mind about it, but the quality of conversations is not good, in the sense that there is no visible action in response to substantiated criticism.
And so the conversation isn’t getting anyone any further.
Now that Inkscape 1.1 alpha is out for everyone to take for a spin, I’m positive we are not so far from the final release. There is a lot packed into the coming update, see the draft of the release notes. I’m really looking forward to whatever they have planned for when 1.1 is out; there is still so much to do!
Publishing
There will be another 1.5.x release of Scribus, likely before April, that has some new PDF importing features and includes a lot of small but refining UI changes. Another release should follow later in the year.
They definitely don’t want major UI changes in the main development branch anymore until they release 1.6, and that sounds like wrapping up for the big release to me. Which means it’s only a matter of how many under-the-hood changes still need to happen.
It’s been a long time since I last looked at Sigil, mostly because I lost my Nook device some years ago and never got to ordering a replacement. Meanwhile, Sigil got better. Like, really better.
They came up with what seems to be a working solution for dealing with both EPUB2 and EPUB3 in the same application, they updated the UI just enough to be recognizable yet somehow nicer, and then they added some boring yet useful utility features.
I do like where Kevin Hendricks and Doug Massay are taking the project. Version 1.5.0 will probably be out in the first half of 2021.
Photography
Thanks to Aurélien Pierre, the last few releases of darktable introduced a more articulated division between scene-referred and display-referred workflows, and it looks like more people are contributing to that effort now.
There are also some interesting things going on in pull requests: a new color balance RGB filter working in JzAzBz color space, with built-in gamut checking; a new diffusion module to add or remove lens blur, hazing, blooming etc.; a new chromatic aberrations correction module that sits in the right place of the modules chain and produces fewer halos than defringe; there’s also a work-in-progress option to correct lens distortions straight from the RAW Exif for camera vendors that write the distortion in there (I think I mentioned that one in one of the last weekly recaps).
Something you can already play with if you build from Git: a new demosaicing method, Ratio Corrected Demosaicing (based on this research), is aimed at reducing pixel overshooting; the color calibration module can now create an ad-hoc internal color profile from a color checker.
Once again, the amount of contributions is getting large enough for maybe more than one major release a year.
Meanwhile, there are currently 63 bugs that need to be fixed for RawTherapee 5.9 to be released. I can’t give you an ETA for it and I’m not sure anybody can. The project is very active, and if you ever came across its fans online, you probably admired how strongly they feel about RT being superior in processing quality and ease of use as compared to some other tools :)
The Siril team is planning to release version 0.99.8 soon and then hopefully v1.0. They’ve completely rewritten image conversion, revamped memory management, and added astrometry support (6 catalogs supported currently).
Animation
The Synfig team is completing the migration from autotools to CMake, and their next step would be to make it possible to build the program with MS Visual Studio. The expectation is that more contributors would join then thanks to a lower threshold.
After that, the rendering code really needs attention.
In 2020, VGC by Boris Dalstein was “pre-incubated” at the BIC de Montpellier startup incubator and additionally obtained a public grant from CNC-RIAM. This means Boris will be able to hire another developer to work on the project.
The situation with enve is… Well, complicated, I guess. In summer 2020, Maurycy Liebner announced that the development is on hold due to health issues. In December, he started pushing some commits to the public repository on GitHub again, not at the old rate though. So it’s unclear if he’s resuming the work.
I think it would help to have some sort of a framework for more people to contribute, even basic things like a roadmap and info for newly arrived developers.
VFX
Natron is one of the projects that I find hard to speculate about. The only active developer right now is Ole-André Rodlie, and his most recent work currently lives in his own GitHub fork. He says he’ll soon start making pull requests to the upstream GitHub project (which, let’s be frank, is a bit like sending PRs to yourself).
There could be a few visual tweaks coming next, but it's unlikely to be at the scale of the UI redesign that has been discussed on Pixls since May 2020.
[Image: Natron UI mockup]
However, the new website for the project, designed by the same contributor, has a good chance of going live soon. How much that will help attract new developers is hard to say. Probably not much, as projects like Natron really need full-time involvement (which I discussed at length in 2018, and not much changed since then).
This is a rather sad turn of events, as it leaves us Blender as the only working free/libre node-based compositing solution (unless you count upcoming Olive 0.2 in). Ramen is long gone, and ButtleOFX developers gave up around 2017. And while I love Blender to pieces, the idea that it is the only option is somehow worrying.
3D
Everything Nodes, a new asset browser, Vulkan support in EEVEE, new character animation tools, and more: all the work planned for Blender in 2021 sounds great. There isn’t really much to add here, apart from mentioning that some of these things will be possible thanks to corporate funding.
Version 2.92 is expected in late February, with features like easily drawing 3D primitives, a whole bunch of sculpting improvements, tracing image sequences to Grease Pencil strokes, cryptomatte support in EEVEE, and so much more.
I’d be a damn fool if I left out Dust3D. Jeremy HU did a lot of experimenting in 2020, and this year, his plan is to merge a lot of that into the main program. He’ll do more experimenting with his quad remesher first though.
One major roadblock is writing his own code to replace the CGAL libraries, which are license-incompatible with Dust3D (GPL in CGAL vs MIT in Dust3D).
But yes, you should generally be looking forward to a new version of the program with better UI, great quads mesh generation, easy creature animation generation, and without complex third party dependencies.
Lubos Lenco is working towards releasing ArmorPaint 1.0 in 2021. He’s also planning an iPadOS release and improved Android builds. Apart from that, the usual stuff: fixing bugs, adding requested features.
CAD
There is never a release schedule for FreeCAD but everything is pointing at a v0.19 release some time in 2021. In fact, Yorik van Havre talked about that in his latest project update. The release notes draft page is a huge beast to tame. Essentially, you are looking at: tons of improvements in pretty much all workbenches, especially Arch, BIM, and TechDraw; add-on management; various UI updates; lots of bugfixes.
Which is great as people do all sorts of wonderful tinkery things with FreeCAD.
BlenderBIM is one of the most exciting projects lately.
Dion Moult currently has two major tasks to accomplish: improving drawings generation to bring them on par with the quality and complexity that existing mature proprietary CAD and BIM applications provide; and adding incremental editing to vastly decrease the amount of roundtripping errors, some of them being the result of Blender’s data model not even being designed to support 100% of the IFC specification.
Dion is also extremely active in the IfcOpenShell project that is critical for BlenderBIM. As a matter of fact, he has already made almost as many commits to the source code repository as Thomas Krijnen since its inception ca. 10 years ago.
Speaking of which, if you are into AEC, OSarch.org now has a blog.
The OpenSCAD team seems to be planning a new release; the latest release candidate was made available just a week ago.
Unfortunately it doesn’t look like LibreCAD v3 is going to happen this year. The project is still seeing some action but just not at the scale to sing ‘release time’ in an angel voice.
I do expect Andrew Mustun to keep producing QCAD Community Edition releases though. Whether we like it or not, his approach to licensing appears to be working. At the very least, he’s in a position where he can hack on it on a regular basis.
And to quit this chapter on a really positive note, SolveSpace v3.0 is just around the corner. They sent out a call for translations update several days ago, so it’s really happening! Moreover, Reini Urban recently started working on replacing libdxfrw with LibreDWG for DWG support in SolveSpace. This is going to be really interesting as LibreDWG is now mature enough, and as of the latest release, its API allows constructing DWG data inside CAD applications (not just export).
Video
The Kdenlive team is likely to participate in GSoC this year, so probably expect more interesting things to happen.
They will also follow MLT development, and beyond upcoming v7, the TODO list now mentions 10-bit and HDR support (in the ‘Future’ section).
Olive development is going to be interesting. I’d say expect v0.2 with all cutting tools working and maybe get used to the idea of using nightly builds for color grading tools (roughly the plan for v0.3). Matt recently published a video (see below) that he cut entirely with Olive from the main development branch, so it’s the real deal.
Blender VSE is already undergoing a major revamp. The focus is on several key areas: performance, better tools and UX, better media management, better i/o workflow (think clip preview and in/out marks for 3-point editing). You can get a better idea by visiting this overview task page.
One major nitpick I have there is that the plans do not seem to directly mention the handling of 10-bit data. Which, after closing the relevant bug report in 2013 as invalid (sic!), seems odd.
I don’t really track other projects all that actively. What I can say about Pitivi though is that you shouldn’t judge the activity of its developers by the commit log. All the really interesting stuff is happening in the Merge Requests section.
Music-making
The development of Ardour 7.0 is going strong but we might still see another 6.x release with fixes and small improvements before the major update happens. Please do note that there are no actual public claims v7.0 is going to be released this year though.
Zrythm is getting really close to the v1.0 release. It’s still at the stage where users are, like, OMG, I think I can complete a song with this! But look, every other DAW started there. And the only other EDM-centered free/libre tool on Linux is LMMS. Which is well on its way to the 1.3.0 release later this year, by the way.
I like a lot what’s going on with MuseScore too. They’ve just released v3.6 with score rendering improvements and a whole new font for score engraving.
The path to v4.0 is not an easy one; they are pretty much rewriting the application because it was difficult to expand the old code base. So please don’t make me predict the ETA of that major update!
The Mixxx team is working hard on getting the 2.3.0 release out (they moved to the beta stage in June last year). The new version features deck cloning, hotcues and track color improvements, importing cue points, track colors, and playlists from Serato, and more (see here for more info). The team also did an infrastructure revamp and has an Outreachy intern to work on the manual and video tutorials; you can read all reports on their blog.
In 2020, there was a surge of VST3 adoption in free/libre DAWs and sequencers, and judging by what I hear from some plugin developers, they are eager to support it as well. So we might see more of that this year.
2021 could also be the year when LSP replaces Calf Studio Gear as the universally suggested audio plugin suite on Linux too. It’s not like Calf was dead. It’s a combination of several factors like slow development and unsolved DSP issues. And then, speaking from experience, there’s nothing like your eardrum being pierced by output from Calf Reverb to make you look elsewhere.
Finally, the amount of available LV2 plugins is getting closer to the point where there’s simply no time to try every new thing. Aren’t we the lucky ones?
The hero image is copyright by Emi Martinez, and it’s coming from his animated short film made with Blender. Check it out!
Writing this article took close to 20 hours including research, testing, talking to developers, bug reporting etc.
If you enjoy the work I do, you can support me on Patreon or make a one-time donation.
Libre Arts was previously known under the name Libre Graphics World. It’s an online magazine for creative professionals using free applications for digital painting, graphic and web design, desktop publishing, photography, and CAD. The project focuses on news, tutorials and articles to provide you with the most up-to-date information about the evolution of these applications and best practices.

Free software is a rapidly evolving niche, and we feel that this fact should be better represented on the web, which is exactly what we do. In many ways, the foundation of this project was triggered by Ginger Coons’s talk “Ownership and Standards: Why Designers are Slow to Adopt Open Source” at Libre Graphics Meeting 2009.

We sometimes (seldom, really) cover selected proprietary software that works on Linux, if it’s interesting enough, has no direct free software alternative, and/or its developers contribute to free/libre software. A perfect example would be Harrison Consoles' Mixbus, which is a fork of the Ardour digital audio workstation, with developers contributing to Ardour in both code and money.

Why rebranding

Initially, Libre Graphics World was intended to cover just the graphics part of digital content creation software. The idea was that the community would take care of the rest, both audio and video.

Despite some community attempts (some of them extremely promising), the expectation fell short. So we had to take matters into our own hands. This, however, led to some confusion because the project’s name had a narrower scope. Hence the rebranding into Libre Arts. There’s a dedicated post with more details.

Contacts

We are interested to hear from you what topics you would like us to cover next, what new interesting applications we missed, what excellent work was done using free software.

There are different ways to get in touch with us. First of all, you can email.
We also actively maintain a Twitter account where we post interesting things related to free software that don’t qualify for a full-fledged article.

The team

The project was founded and is currently managed by Alexandre Prokoudine, who is affiliated with several free/libre projects such as GIMP and Ardour.

The new website was made possible thanks to the PIXLS.US team:
Pat David
Mica Semrick
darix
Kees Guequierre

Former contributors:
Maxim Barabash, web programming. Contributor to sK1, PrintDesign, UniConvertor. Twitter: @Maxim_Barabash
Vera Lobacheva, web programming. Contributor to Open Clip Art Library. Twitter: @summerstyle

Copyright policy

Unless explicitly stated otherwise, all the content is available under the CC BY-SA 3.0 Unported license. Feel free to translate and publish it, but please name the original author and link back to the original.
Why I ask for donations

Conducting insightful interviews and writing in-depth articles takes a lot of time.

Every interview involves researching the subject, interviewing a person, sometimes transcribing audio, then editing the text.

Every weekly recap means I grab the latest code of most projects I write about, build from source, actually look at the changes, sometimes ask developers questions, then write down all I figured out and make sensible screenshots. I also watch new tutorials and go through dozens of recent artworks made with free software to pick the best ones.

Hence making this project sustainable involves financial support one way or another. I rely on the community to fund this work.

Where you can donate

Libre Arts accepts both recurring and one-time donations. Currently, there are three options:
Liberapay. For recurring donations, increasingly popular with the free software community.
Patreon. For recurring donations, a very popular platform for content creators and even programmers.
PayPal. You can use it for one-time donations. Please note that this is a PayPal.me page for now; a better solution will be launched next year.

How I use your donations

The money covers hosting and domain name ownership expenses, website maintenance, as well as the cost of producing in-depth articles (research, writing, preparing illustrations etc.).

Part of the proceeds also goes to various free software projects like Blender Foundation, Ardour, FreeCAD, and others. If you feel like supporting software projects directly instead, I particularly recommend Blender Foundation, Krita, and Øyvind Kolås (GEGL/GIMP).

Thank you for considering supporting this project!
Week highlights: new releases of BlenderBIM, Shotcut, Surge and liquidsfz, new features in GIMP, darktable, Krita, Blender, ArmorPaint, Olive.

Graphics

GIMP’s developers mostly focused on file plug-ins last week:
Now when you export, file dialogs won’t be visible, the progress will be displayed in the status bar.
GIMP now correctly handles the gAMA image chunk in PNG files at importing, displaying, and exporting time.
The OpenRaster plug-in now supports progress updates for importing/exporting, actually supports layer groups (both importing and exporting), and will set Normal mode for layer modes not supported in ORA (the plan is to write them in GIMP’s namespace later on, the way Krita does it now).
For plug-in developers, the build system now generates Python API documentation (GObject Introspected).

All that work was done by Jacob Boerema and Jehan Pages. There are some interesting patches by Stanislav Grinkov sitting in the Merge Requests section, like a live update of selected text color. One of them, selecting a template from the Canvas Size dialog, was actually merged today, and I expect the rest to follow soon.

The Krita team is also very active:
Scott Petrovic returned to hacking on Krita, tuning various aspects of the user interface.
Wolthera van Hövell started working on support for ICC profiles in SVG files (it’s currently happening in a dedicated branch).
Dmitry Kazakov did a lot of resource management work.
L. E. Segovia merged the first working version of the GMic-Qt as a native Krita plugin.
Sachin Jindal added distance and angle measurement on the canvas for the Measure tool.

Photography

Thanks to Hanno Schwalm, darktable now supports dual demosaicing, the idea and some of the code coming from RawTherapee.
Here’s the rationale, as per the commit message:

In some images we have areas that would be best demosaiced with an algorithm preserving high frequency information (like amaze or rcd) and other areas that might profit from another demosaicer better suited for low frequency content like vng4.

The team also made a kinda predictable announcement on Mastodon last night:

To keep up with the current development pace, we have decided to keep releasing 2 major versions of darktable per year: one for summer solstice and the usual one for winter solstice. Those 2 major releases will ship new features, while the dot releases (like 3.4.1, which will be frozen next week) will only ship bug fixes.

Siril is getting all sorts of fixes and translation updates in preparation for the v0.99.8 release. In particular, Cyril Richard rearranged items on the header bar.

3D

The Blender team had a productive week:
Hans Goudey did some good work on Geometry Nodes
Kevin Dietrich worked on proxies for Alembic procedural
Pablo Dobarro implemented elastic surface falloff
Richard Antalik had some progress with fixing broken blend modes in VSE
Sebastian Parborg multi-threaded the action editor for 4x speed increase in certain scenarios
Sergey Sharybin exposed all UV interpolation options in Subdiv and set a better default UV interpolation
Sybren Stüvel worked on the Pose Library design.

ArmorPaint got several new features: support for assets packing on exporting (materials and brushes), better layer selection, new Curvature Bake node. Here is a video on the new UV node added a few weeks ago:

CAD

Dion Moult released a new version of BlenderBIM with 110 new features and fixes.
Here are the most important changes:Zero roundtripping data lossTwice as fast importing, much faster exporting (350MB large IFC in 20 seconds)UI now adapts to the IFC schema version in useOnly a part of a large IFC can be edited now, and that part will store the authorship/contributor metadataInitial new system for 3D annotationsFor more info, please see the release notes.Johnathon Selstad recently announced CascadeStudio that’s pretty much the OpenCascade kernel wrapped up in JavaScript with three.js front-end. Video A new release of Shotcut is out. Here is what Dan and Co. added:AV1 decoding and decoding.New Advanced mode in the Convert to Edit-friendly dialog with a number of options including an HDR transfer function.New video filter named Reduce Noise: Quantization.As for Olive, MattKC started another sweeping rewrite in a branch that’s now public. Some of the goals are:Greatly simplify node connections (particularly with arrays) so the code requires less maintenance/is more stableRedesign node structure to address issues where UI would stall for lengthy periods of timeAs you might expect, he couldn’t stop at that and did more: implemented a new undo system, rewrote the Ripple tool, improved the Curve Editor dialog etc. It looks like more changes will land to that branch before it will be merged to the main development branch. Music-making Surge 1.8 is out. If you are a Linux user, you will need a VST3-capable host like Ardour or Zrythm, although you can build this softsynth as an LV2 plug-in as well. Release highlights:New and improved skinsNew filters, with multiple new filter modelsMulti-segment envelope generator now works as a modulation sourceLots of Airwindows FX available in the FX chainOver 2,000 presets now shipped with the synthSee here for the full changelog. Stefan Westerfeld released a new version of liquidsfz sampler. 
Changes:LFO support, both old style (amplfo_*, pitchlfo_*, fillfo_*) and new style (lfoN_freq, lfoN_pitch,…)Support for curve sections and related opcodesMinor fixes and cleanupsBoth tarball and a binary build for Linux are available. Tutorials Evelyne Schulz, a tutorial on painting shiny and matte surfaces with GIMP: Very nice Inkscape tutorial by Zakey Design: And another one from UkrArtDesign: This is good fun with painting symmetry in Krita but even 0.25 playback speed doesn’t always help (and you definitely want to turn off music in that case): New tutorial from Blender Tutor: How to Animate Hair with Hair Dynamics in Blender 2.91 New tutorial from Andrew CADm this time on using the mirror feature, linear pattern, polar pattern and multi-transform in FreeCAD: Artworks Matthieu Coudert posted an experiment illustration made with Alchemy brush in Krita: Animated portrait, made entirely with Krita, by Alartriss: Animated portrait I did of @ jyundee's character for their dtiys on instagram :)Painted in #krita, animated in #blender Speedpaint: https://t.co/gXnTTpdSEH pic.twitter.com/mcO8XFKros— Alartriss (@alartriss) January 30, 2021Gioele Muscolino is having fun with Blender and GIMP to render galaxies: Each of my weekly recaps involves researching, building and testing software, reporting bugs, talking to developers, actually watching videos that I recommend, and only then writing. Time-wise, that’s between 7 and 15 hours. If you enjoy the work I do, you can support me on Patreon or make a one-time donation.
Revolutionizing garment-making in Italy with Valentina
One of the benefits of working on Libre Arts is that people come and tell me about interesting things they do. So recently, I spoke to Luca Lavore, who works with his father in their family tailor shop of 30 years in the Palermo district, on the beautiful Italian island of Sicily. And they are big on using the free/libre program Valentina for pattern design.

Luca, when you first contacted me, you said you were ready to tell the world that Valentina could be the real fashion revolution. What kind of struggles did you have as a clothing brand / business that led you to believe a revolution needs to happen?

In 2016, I started to collaborate with my father in our family tailor shop. Although our workflow was already oriented towards lean manufacturing processes (rather than traditional conservative tailoring), the main phase, which is pattern making, couldn’t be done with computer-aided design tools due to the huge cost of those applications and the unwillingness of software houses to adapt their offer to small artisans. This was, and still is, the main cause of wasted time in the pattern/cut process for a bespoke or made-to-measure garment. When I discovered Valentina in late 2018 (thanks to a course made by Sara Savian of WeMake Milan, in Italy), I suddenly understood the huge power of that software, for its nature of free/open source and parametric software. A pioneering project that embraced the power of mathematical functions applied to pattern drafting, totally free; I couldn’t have asked for more.

When you say “lean manufacturing processes”, what does that mean? I’m familiar with that term for factories, which typically means little-to-zero waste, but not sure about tailor shops.

Traditional tailoring maintains practices and methods invented 200 years ago (or more).
Drafting with chalk directly on the fabric, a lot of basting stitches, undefined seam allowances, undefined patterns, three fittings with the customer, and many more. We worked a lot to take inspiration from factories' best practices and bring them into bespoke tailoring. We bring this pattern method to perfection using specific industrial machines and organizing our laboratory with a proper logic. And recently, with the use of CAD tools.

When did you start evaluating Valentina?

At first, I approached Seamly2D, soon after the split between the original project founders. I decided to go on with the Valentina project because it is way more responsive to pattern-makers' requests and suggestions. Now my role is to translate the software into Italian and to manage the Facebook group, and I’m constantly in touch with programmers to give them feedback and advice.

Why do you think Valentina is the real solution to the struggles you had?

If you are a small artisan, or a young student, or a sewing enthusiast, before Valentina came about, you had only one choice: spend a huge amount of money and time to learn expensive and non-intuitive software designed for big factories.

For shops that already use proprietary software, what do you think could be the benefits of switching to Valentina apart from the price tag and maybe availability of localization?

I couldn’t suggest switching to Valentina to factories, because they already rely on existing CAD and CAM systems that have multiple tools in their suites that allow them to manage production from sketches to cutting with automatic machines.

The best way they can use Valentina is for quick pattern making, exporting to DXF and importing into corporate software, or to benefit from parametric design in Valentina for a better approximation of grading than the “X/Y coordinates” one.
A professional pattern-maker I know, who is based in Turin and owns his own pattern books and method, uses the software just for testing the parametric functioning of his models before outputting them to Investronica software.

Those whom I recommend to take Valentina seriously are teachers, students, freelance designers, and small clothing production companies that want to take a step into the future. I hope they will; after all, if we did it in the remote heart of Sicily, why can’t others?

What’s different in your workflow after beginning to use Valentina? And how did it affect your business?

My workflow is now the most advanced in Italy among small tailoring businesses. Besides the time we save making clothes, we also use less fabric thanks to digital layout (here comes Inkscape), and our know-how in high-end pattern construction led us to provide our files to a factory here in Sicily that works with a famous CAD software. For the future, we are thinking of launching a website to sell our digital patterns.

Do you mean you sell them designs for mass production?

Yes, factories often don’t have a full-time internal pattern-maker. So, if they want to be ready for new requests made by independent designers, they might need customized patterns. We can do this, with Valentina, and that’s saying something!

How do you exchange design files? DXF? SVG? PDF?

There are a lot of ways to exchange files. Normally CAD programs work well with DXF-AAMA, but sometimes it could cause data loss; so a flat DXF becomes very useful.

How much are you involved with testing the software and providing feedback to Roman?

Roman allowed me to save a lot of time in my pattern drafting: two years ago Valentina lacked simple features which had big significance for productivity.
He has the mindset to follow the right feedback and knows how to implement new functions that work perfectly, so he heard my questions and trusted me.

Since I use the software every day, I chat with Roman every week to exchange news, so I’m one of the few main contributors in the front row about the direction of Valentina.

You also mentioned that you make some use of Inkscape for modifications and layouts. Could you please tell me more?

At the time of writing, Valentina has a tool to automatically place pieces on a layout, with preferences given by the user; and for someone who cuts pattern pieces individually to position them afterwards on the fabric, it could be enough.

But since we use professional cutting practice, we need to manually place every pattern piece in the digital sheet area (as wide and long as the fabric is) before printing, very close to one another and with a particular arrangement. This is still not possible with the software, so I do that with Inkscape. Valentina will soon be able to do that with a new bundled tool called Puzzle.

I can say that Inkscape is nearly as important for me as Valentina is. Sometimes I need to draft or correct parts of my patterns with Inkscape because I want to manipulate or merge them in a way that Valentina doesn’t yet allow me to do. And I’m very happy to have these two outstanding applications for my work!

You can see more of the work Sartoria Giuseppe Lavore do in their Instagram account.
Optimize PNG and JPEG images on Linux with Curtail
There is a lot of push towards better image file formats all around, especially on the web. And while personally I have already moved to WebP for all new content on Libre Arts, there are some valid use cases for PNG and JPEG.

You could be sharing a screenshot created by your desktop environment when you press the Print Screen button. Or maybe you have to use an old content management system that does not accept newer image file formats. Either way, it’s a good idea to compress images before uploading.

And if you never liked using tools like optipng from a terminal, I have good news for you. Apparently, a while ago, French developer Hugo Posnic created a tiny application called Curtail that is pretty much a simplistic user interface for optipng, pngquant, and jpegoptim. Here is the video, the transcript goes below.

I saw Garrett LeSage mentioning that to a fellow Red Hat designer, Máirín Duffy, on Twitter, so thanks to you both :)

You can install Curtail from Flathub or you can build it yourself, which is an exaggeration because it’s just running meson and ninja over Python code; there’s no compilation involved per se.

Now, this is almost everything there is to see in this program:

It’s just foolproof, it’s what you have probably grown used to seeing in online services like tinypng.com.

You select whether you want compression to be lossy or lossless, and then you drag and drop files onto the main window (you can drop just the folder, in fact). Then they automatically get compressed, and you can see the names of files, old size, new size, and how much you saved in percent. Once compression is done, you can click the left arrow button here to go back to the main view.

Curtail does have some options. Go to the header bar, click the three-dots button, choose ‘Preferences’.
You can enable or disable the preservation of metadata which you don’t need for things like icons but probably want for photos.You can choose whether you want to overwrite original images or create new ones, in which case you also need to specify the text that will be appended to the base file name. Curtail suggests this -min text by default which sounds like a reasonable default.You can also force Curtail to use a dark version of the user interface theme, if there is one.Now, ‘Compression’ and ‘Advanced’ pages both have options for setting compression level. The former one is for lossy compression of PNG and JPEG: And the latter one has the setting for lossless PNG compression that is basically a trade-off between making smaller files and compressing faster: And now that is really all there is to see here. It’s a nice little app. It does one job and it does it well. It’s entirely possible that adding some more compression settings would be helpful, so if you feel that way, the project has a bug tracker on GitHub for feature requests.
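For readers who do want to script the same job, here is a minimal Python sketch of the workflow described above: pick a tool by file type and lossy/lossless mode, optionally strip metadata, and either overwrite the original or write a copy with a suffix such as -min. This is not Curtail's actual code. The function names, the mapping of modes to tools (pngquant for lossy PNG, optipng for lossless PNG, jpegoptim for JPEG), and the quality values are assumptions made for illustration; only the command-line flags are taken from the tools themselves.

```python
import shutil
import subprocess
from pathlib import Path


def output_path(source, suffix="-min", overwrite=False):
    """Return the destination: the original file, or its name plus a suffix."""
    src = Path(source)
    if overwrite:
        return src
    return src.with_name(f"{src.stem}{suffix}{src.suffix}")


def build_command(target, lossy, keep_metadata=True):
    """Build the argument list that compresses `target` in place.

    Tool choice and quality values are illustrative assumptions.
    """
    target = Path(target)
    ext = target.suffix.lower()
    if ext == ".png" and lossy:
        # --ext .png together with --force makes pngquant overwrite the input.
        cmd = ["pngquant", "--quality=65-80", "--force", "--ext", ".png"]
        if not keep_metadata:
            cmd.append("--strip")
    elif ext == ".png":
        # Higher -o levels try more settings: smaller files, slower runs.
        cmd = ["optipng", "-o2"]
        if not keep_metadata:
            cmd += ["-strip", "all"]
    elif ext in (".jpg", ".jpeg"):
        cmd = ["jpegoptim"]  # lossless optimization unless --max is given
        if lossy:
            cmd.append("--max=85")
        if not keep_metadata:
            cmd.append("--strip-all")
    else:
        raise ValueError(f"unsupported file type: {target.name}")
    return cmd + [str(target)]


def saved_percent(old_size, new_size):
    """Percentage of the original size that was saved."""
    return round((old_size - new_size) / old_size * 100, 1)


def compress(source, lossy=False, keep_metadata=True, suffix="-min", overwrite=False):
    """Compress one file and return (old size, new size, percent saved)."""
    src = Path(source)
    old_size = src.stat().st_size
    dst = output_path(src, suffix, overwrite)
    cmd = build_command(dst, lossy, keep_metadata)
    if shutil.which(cmd[0]) is None:
        raise RuntimeError(f"{cmd[0]} is not installed")
    if dst != src:
        shutil.copy2(src, dst)  # work on a copy so the original stays intact
    # check=True raises if the tool fails, e.g. pngquant cannot reach the
    # requested minimum quality and exits with a non-zero status.
    subprocess.run(cmd, check=True)
    new_size = dst.stat().st_size
    return old_size, new_size, saved_percent(old_size, new_size)
```

Calling `compress("screenshot.png")` would, under these assumptions, leave the original alone and write `screenshot-min.png` next to it. Copying first and then compressing the copy in place keeps the three tools on the same code path, since each of them can overwrite its input.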
Highlights: new releases of Zrythm and Guitarix, new features in Krita, darktable, Siril, Blender. Graphics Niels De Graef continues bringing more GTK3-specific features to GIMP. The program now uses a listbox widget for icon themes and title bar formats. And thanks to Stanislav Grinkov, you can now reverse the order of pages as layers when importing PDF.<img src="https://librearts.org/2021/01/week-recap-25-jan-2021/sw-gimp-reverse-pdf-order.webp" loading="lazy" alt="Reverse PDF pages order in GIMP" width="50%"/> There’s a lot of bugfixing in Krita following v4.4.2 release but also some new things: the context menu for on-canvas crop tool now provides options to lock width/height/ratio, and then there are some styling improvements all around. Photography darktable now has with a new darkroom tab: basic adjustments. Basically, if you only ever change e.g. white balance, exposure, contrast, and ‘strength’ in the velvia module, you can pull out these sliders and make them accessible in a single place. No need to navigate modules at all. If you ever used Ardour extensively, you will be nodding right about now, because it’s exactly like adding a plug-in’s setting as a slider to a mixer’s channel. Except instead of right-clicking on the plug-in in the mixer, you do it from the dialog where you configure module groups. The Siril team seems to be switching to bugfix mode for 0.99.8 release: New in dev, maybe last feature before the release:Very convenient for finding small objects in wide field, like the star of an exoplanet.#photometry #astrometry #astrophotography #FOSS #opensource #science pic.twitter.com/BzNYBHRGvR— SiriL (@SiriL_Official) January 25, 2021 3D Blender 2.92 isn’t out yet but there’s already even more new stuff to be available in 2.93: more drag’n’drop support in the Asset Browser, new Attribute Sample Texture node in geometry nodes, improved Object menu organization and naming, and more. 
See today’s meeting notes for details or watch this video:

Music-making

A new alpha release of Zrythm is out with Modulator macro buttons (video by the Zrythm team).

Robin Gareus recently started work towards M1 and macOS Big Sur support in Ardour, while Paul Davis keeps hacking on the nutempo 2 branch (bringing all sorts of MIDI fixes). The rest is plumbing work.

Guitarix 0.42.0 was released recently and now features a rework of tube emulation, contributed by Damien Zammit. This makes the simulation more believable, with better dynamic responses. It will break existing presets, although developers are confident that users will love the new sound a lot more than they will dislike updating presets. The release tarball is here.

Rob van den Berg released the first couple of versions of Drops, a simplistic sampler that allows loading a WAV file, cutting it, setting notes for clips, and tweaking filters. The second one adds pitch tune and oversampling.

Tutorials

A GIMP photo manipulation tutorial from Codingcreator:

Really quick tutorial on drawing a penguin with Inkscape, the part with shading on the beak is a bit wtf though :)

You might want to set 0.25 playback speed on this Krita painting timelapse. Fortunately, the video has a 60 frames per second rate, so that’s bearable.

Okay, this is positively crazy: procedural galaxies in Blender:

Artworks

New interior design render with Blender, by George Turmanidze:

Great mini story about an evil sloth, by Juan Hernández, made with Blender:

Chris Hildenbrand did probably the funniest take for the hat challenge on the Inkscape users' group on Facebook.

Meanwhile, dillerkind keeps drawing cute characters with Inkscape:

New winter landscape by Philipp Urlich:

Each of my weekly recaps involves researching, building and testing software, reporting bugs, talking to developers, actually watching videos that I recommend, and only then writing. Time-wise, that's typically between 10 and 20 hours.
If you enjoy the work I do, you can support me on Patreon or make a one-time donation.
Evelyne Schulz is one of the artists whose work made with GIMP is what I usually show to people to demonstrate that GIMP is a quite capable tool for digital painting. So I guess showcasing her work here was long overdue! She also recently returned to making GIMP tutorials and posting them on YouTube.Hello Evelyne! Thank you for agreeing to do this :)Hi there! Yes, thanks for reaching out :)Some years ago, you did an interview for OCS-Mag and you basically said you grew up with Linux, and GIMP was the only choice you had back then which was, what, 1997?Yes, I think I started using GIMP on KDE when I was 12 or 13, so 1997 would be about right.And then you chose to study for Mediamatiker which is a cross between graphic designer and an IT specialist. And now you work for a print company. So I’m sorry if I’m overanalyzing, but basically earlier, clunky versions of Linux and GIMP defined your career in media arts and you still use GIMP for fun and profit? While some people would argue till they are blue in the face that if you (only) know free software like GIMP you won’t get any far in the industry? :)As you stated correctly, I’ve been in the graphic and print industry for a long time, so of course I have also access to the whole palette of Adobe products, including Photoshop. Many of our customers work with these, so often this is the way to go. I still have GIMP on my computer at work as well though.<img src="https://librearts.org/2021/01/the-art-of-evelyne-schulz/evelyne-schulz-shadow-world.webp" loading="lazy" alt="Shadow World"/> Shadow World For my own art I almost exclusively use GIMP, that’s where I feel “at home”. It’s easier to use, even though people keep telling me that GIMP is complicated and they can’t figure out how to use it. 
I think it’s mostly about what you are used to, and of course Adobe is really good at providing school licenses and trials, so children and students often grow up with Photoshop, and have a hard time switching to different software later on.

How much do you use advanced features of the brush engine like response curves for pen pressure etc.?

Not at all, to be honest. It is something I should invest some time in soon, but my life is incredibly busy these days. Sometimes all you can do is follow “business as usual”. I’m very excited about all the possibilities, and I hope that I will be able to experiment a lot more with them, once I’m in a more stable phase work-wise.

What kind of changes would you want GIMP to have to better suit or even improve your workflow? Like, is there something that would really boost your productivity?

As often requested, CMYK support would make a lot of things easier, especially for me of course, since I’m working in the print industry.

Regarding art, it would be awesome to have a color palette dialogue that can be placed somewhere. So you can switch the foreground color with one click, and have a nice overview of the colors you are using. (I’m not sure if this actually exists, but if it does, I don’t know where.)

<img src="https://librearts.org/2021/01/the-art-of-evelyne-schulz/evelyne-schulz-hope.webp" loading="lazy" alt="Hope"/> Hope

I see that you also dived into using Blender, at least judging by the “Pronunciation is hard!” video. But I was a little surprised to see that you still use 2.78 while most artists I know quickly switched to 2.8x because of all the UI improvements. What made you stick with the older version? And how far are you willing to explore 3D?

I’m also a programmer and a huge fan of VR, and back in 2019 I started building a VR open world game in Unreal Engine 4.
Basically that’s the real reason why I started using Blender.

At a later point I was commissioned to do a piece of art with a background where perspective matters, so I thought, why not just do these elements in Blender? And I kind of liked the fact that I didn’t have to think as much about the perspective anymore. Personally I’m not much into pure 3D art. Of course there are brilliant artists that do stylized renders or extremely realistic images, but I always feel like there is some “life” missing, no “happy accidents” as Bob Ross would have said.

Depending on what pieces I will do in the future, I will definitely work with Blender some more. I’m not experienced at all though, and it will probably always be as a base, with finishing touches in GIMP.

By the way, I’m using the new version of Blender now. The only reason I was using the old one is that I’m working on an old MacPro from 2003 that could not be upgraded to a new OS anymore. I recently upgraded to a new graphics card to solve this issue though.

As parent to parent… How do you make time for art? :) Because in one of your videos, where you show how to shade a sphere, you talk in the kind of a soft voice that I use when kiddo is asleep in the next room, and I really need to record my voiceover :)

Haha! Well, my children are both teenagers, and I often go to bed before them. I’m autistic and very soft-spoken by nature, so voiceovers are a challenge for me. Time can be a big issue, yes. In my case it’s not because I have to watch my children anymore, but usually, working a full time job doesn’t leave much spare time, and there are also everyday chores at home. I think it has mostly to do with priorities. If you really want to create art, you will be able to make room for it. In my case I find that creating art calms me down and charges my “mental batteries”, so I try to give it a high priority.

It can help to make the conscious decision to paint at least some, maybe just 10 minutes.
I often find that simply starting a piece can be very motivating, and you might end up working on it for an hour instead.

I’ve been tracking your work ever since the “What does the fox wizard say?” painting. Which was done by you shortly after the Ylvis song, I think, but I vaguely recall the origins were different and there was a fan poll involved :)

Right, the “Wizard Fox” was really fun to do. It’s old, and I could do much better today, but I still like the piece a lot because it’s so funny.

<img src="https://librearts.org/2021/01/the-art-of-evelyne-schulz/evelyne-schulz-wizard-fox.webp" loading="lazy" alt="What does the wizard fox say"/> What does the wizard fox say

Back in 2014, I was much more involved in the Tumblr community, and I did a poll where my followers could suggest things I could paint, and then vote for the different options. One person came with an obvious troll suggestion, which was “a wizard fox riding a dinosaur with thigh high boots and a false mustache”. Instead of just discarding it as a joke, I took it as a challenge. I might have cheated a bit with the thigh high boots, but the dinosaur actually got a false mustache. ;D

Would you believe it, it’s still one of the go-to pieces for me to show people that you can make great art with GIMP :)

Hah, that’s actually amazing. :) I’m glad you like it that much. I just adjusted it a tiny bit color-wise, and reposted it on my dA page.

So what are your usual triggers? What do you strongly respond to as an artist?

Sometimes I do commissions, so I need to follow the customer’s instructions and ideas. When doing art for myself, I usually want it to tell a story, or depict a beautiful or mystical place, where a story could start.

I’m a daydreamer. Sometimes I’m on a walk and see something and get an idea, sometimes I hear or read a story and I work with the images it inspires in my head. Nature inspires me a lot.
In my head I combine it with imaginary places when I need a break from the real life, for instance when out on a run, or in bed at night.
<img src="https://librearts.org/2021/01/the-art-of-evelyne-schulz/evelyne-schulz-not-far-now.webp" loading="lazy" alt="Not far now"/> Not far now
It also happens that I start creating a painting, and I get stuck, and then I show it to my children. They sometimes come up with fresh ideas, because they see the piece from a different perspective. My newest piece (will be finished today) is a prime example, where my son suggested adding some life by placing a dwelling, and it just made it much more interesting.
You also have quite a bit of Loki in the gallery, probably due to working on covers for the “Fragments of your Soul” book series. How deeply ingrained do you think is the Norse mythology in your culture?
Yes, my gallery is still kind of dominated by Loki, but I guess that will slowly begin to change now.
Here in Denmark people usually know quite a bit about Norse mythology. It is an important part of history and culture. Thor is a very common name here, and you often see streets named after places or characters from the mythology. It’s normal that you find it everywhere, and people from different places, Asia or the US for instance, are not used to it.
So, like, for instance, my local Aldi store has cheeses named after Loke, Odin, Thor, Balder and more, the brand is called “Asgaard”. So I can take a picture of cheap cheese in Aldi and post it online, and some people go nuts about it. :) I think that is kind of funny.
<img src="https://librearts.org/2021/01/the-art-of-evelyne-schulz/evelyne-schulz-loki.webp" loading="lazy" alt="Loki"/> Loki
I’ve seen a lot of art created by beginners that lacks what I would call personality. In the analog world, sometimes it’s a matter of picking an interesting and maybe challenging technique that one really likes and then working hard on the skills. What was your way to develop your own style? 
How did you challenge yourself and how much do you still do it?
That’s true (the lack of a “personality” or style), but I also think it’s a natural part of the journey. I do believe it takes a lot of experimenting, and you develop a style automatically over time, and it can take a long time, depending on how much time you spend creating art.
For my part, I copied a lot of other artists’ styles, just to explore them. I started drawing characters like in Anime films when I was about 10 or 11 years old, drew mangas, copied different mangakas’ styles, then started to develop something like my own anime-like style. But that was just a phase. In art school we had to try out different techniques and styles, it was part of the curriculum.
<img src="https://librearts.org/2021/01/the-art-of-evelyne-schulz/evelyne-schulz-the-girl-and-the-giant.webp" loading="lazy" alt="The Girl And The Giant"/> The Girl And The Giant
Personally, school was so stressful for me that I pretty much stopped creating art in my spare time, and I think it was one of the reasons why there was not much of a development regarding a personal style. I do believe art teachers should leave more room for individual experimentation.
I don’t think I could see any “personality” in my own art before 2014 or so. And I truly believe it’s mostly about experience and a natural part of the development that comes with it.
What do you think is critical for any aspiring artist to become really good?
I know it’s boring, but like most artists say: the more you practice, the more you improve and develop your technique. And it’s still true. Tutorials can be very helpful, and YouTube is stuffed with them, but remember to practice the shown techniques yourself. Don’t wait until you feel “inspired”, because inspiration might never come on its own. 
Sometimes you just have to force yourself to sit down and start doodling.
<img src="https://librearts.org/2021/01/the-art-of-evelyne-schulz/evelyne-schulz-the-ice-wastes.webp" loading="lazy" alt="The Ice Wastes"/> The Ice Wastes
The same goes for the feeling of having an idea, but not being sure about the details, so you don’t start. You might never be sure how exactly you want to express your idea, so just seize the next opportunity and give it a shot. Maybe you will scrap that again, but it will help you get started. You wouldn’t believe the number of people that tell me that they would love to get into art, describing all their amazing ideas, but never actually sit down and get started.
Many children and beginners also tell me that they lack the skills to express their ideas. So when they try to turn their idea into an image, the result is disappointing and not what they wanted and imagined, so they get frustrated and give up eventually. But you can’t develop these skills without trying.
So even though the results might not be what you wanted in the beginning, these are not failures. These pictures are natural steps in a long process of improvement, and if you keep going and appreciate this, one day you will be able to turn your inspiration into the pictures you imagine.
Who are some of the artists that you look up to?
These days I love the works of Andrei Riabovitchev and Ilya Drokonov.
You can see more of Evelyne’s work on ArtStation and DeviantArt, and subscribe to her YouTube channel.
Sunsetting Libre Graphics World, introducing Libre Arts
My initial plan to stick to just graphics software went pear-shaped mere months into the journey with Libre Graphics World back in 2009. So this is a long overdue goodbye to the old project.
What’s going on
Covering music/video software on a website that says “Graphics” was a bit of a puzzler for some readers. I think Paul Davis of Ardour fame stopped chuckling over that maybe just a year ago. I wouldn’t say the name used to be a huge branding issue (I have the fellow GIMP project for that, you know). But it became slightly vexing.
And then there was the whole topic of using ExpressionEngine for CMS. My initial choice (against WordPress) was made back in 2011 primarily with security concerns in mind, but I did not investigate EE’s theming capabilities as well as I should have. And switching/updating themes with EE turned out to be a complete and utter disaster.
Finally, last year, the CMS started working erratically. I wouldn’t be able to upload any images, and sometimes posts wouldn’t show up on the home page. It was clearly time to pull the plug, and retiring EE along with the old domain name looked about right to me.
Whodunit
In fact, I’ve been meaning to do the switch for several years now. The original idea was to go with a Ruby-based static website generator called Jekyll. But since I didn’t have a lot of spare time, I’d do some work, and then when I returned to do some more, every extra module I had used wouldn’t work with the newer version of Jekyll. This went on and on, and eventually Pat David offered to help me port to either Pelican or Hugo instead. We settled for Hugo and a custom magazine-like theme based on Bootstrap.
At the very end of 2020, stars finally aligned. Pat hacked on the theme throughout December like there was no tomorrow, which, let’s agree, was quite fitting for 2020. 
Apart from Pat, the whole Pixls.us team got involved: both Mica and Darix participated in setting up CI for GitLab and fixing various issues, and andabata set up a Matomo account. After a lot of fiddling, the new website went live right around 31 Dec / 1 Jan midnight. Nerds :)
Given my pace and lack of real programming skills, I couldn’t have done it without the Pixls team! Who are now actively hacking on their next project, which is the new website for Siril, the free/libre astrophotography program.
What’s new with Libre Arts
Some technical information about the website is available on the colophon page.
The layout is considerably magazine-like, I’ve more decisions to make there, and anyway, at this stage, I’d like input from more people. The logotype is a placeholder, I will take care of that in due time.
The taxonomy is rather different from that of LGW. There are weeklies that seem to be somewhat popular among readers, a tutorials section (that I will later expand, hence the category name ‘Education’), showcases, reviews and interviews, and then, of course, the podcast (the next episode is at the editing stage). But a lot of people are only interested in specific topics, so I assign categories like “3D”, “Video”, and “Painting” to posts as well. Finally, all posts are tagged with “projects” which gives the opportunity to expand dedicated pages.
The comments section is hosted by discuss.pixls.us, so you need an account there. I feel a lot better about using a system hosted by friends rather than something like Disqus.
What’s next
So this is what I’ve been busy with these past several weeks, apart from work and family.
In its current state, Libre Arts is more like a beta. There’s styling to be finalized, nice hero images to be added, but more importantly about 1,000 posts still to relocate from one domain to another and redirects to be placed accordingly. Which means that within the next month or so, the RSS feed will be unusable. 
And I will need to wrap up LGW gracefully.
Moreover, after a game of hot potato, I’m now in the process of becoming the owner of libremusicproduction.com. As much of its content as possible will be merged into Libre Arts in the coming weeks.
As usual, I have a bunch of drafts for new posts, so stay tuned for updates! And please let me know how you feel about styling and layout decisions.
Highlights: new releases of Krita and Shotcut, new features in Siril, Blender, Olive, some interesting ongoing development in darktable.
Graphics
GIMP developers have been doing some bugfixing, especially in the metadata editor code (mostly done by Jacob Boerema). Jehan patched the Input Devices dialog to remove some cruft such as listing virtual devices.
Krita devs released version 4.4.2 (this week, technically, but what the hell :)). It comes with a metric ton of changes, mesh gradients, mesh transform, new gradient editor and more, and there are more improvements in the pipeline. I’d say the patch by Dmitry to add in-stack transformation preview is one of the most important ones. Let’s hope it will make it to the next release soon enough.
Photography
Darktable has been in feature freeze since November for the coming v3.4 release scheduled for winter holidays (as usual). So there’s no new stuff in the main development branch. However, people are busy writing new code in forks and making pull requests.
One such new feature is applying lens correction from metadata that Sony and Fuji (APS-C) cameras write into raw photos. I happen to have a Fuji around, but the module requires an unreleased version of Exiv2, so no luck right now.
To a large extent, the module duplicates features of another existing module that uses the Lensfun database, and judging by this Pixls thread and a conversation with Aurélien Pierre, this new feature will most likely have to be moved to the existing plugin instead.
Siril keeps getting more tools for version 1.0:
“What's new in dev? Astrometry, annotation, ... fun tools. #freeastro #opensource #astrophotograhy #astrometry pic.twitter.com/a44mqWsRJO” (SiriL, @SiriL_Official, December 2, 2020)
3D
Geometry nodes landed in the development branch leading up to Blender 2.92. There is a video on that. Meanwhile, the 3D World magazine released a special issue dedicated to Blender.
Video
MattKC posted the November update for Olive. 
Highlights:
- Renderer rewrite with abstraction layer so that Vulkan/Metal could be easily added
- OpenColorIO v2 support, bringing accurate GPU-side rendering at export time
- Internationalization support has been reintroduced
- The transform node is back with better on-monitor handles to transform the video and interpolation settings
See here for more.
A new Shotcut version is out with minor new features and improvements, see 20.11.25 release notes for more info, but grab the 20.11.28 release for latest fixes.
Tutorials
You might find this overview of denoise modules in darktable useful, all done by Stefano Ferro.
In an unbelievable turn of events, Griatch is back with an annotated Krita timelapse.
Here is a very cool speedpainting timelapse in GIMP, by JTRox2020 PH.
We can argue whether drawing over photo references is worse than the original sin, but this is a nice timelapse of doing exactly that with Inkscape.
Or here is an original art timelapse from István Szép.
Some winter holidays FreeCAD goodness from Andrew.
Artworks
I never get tired of posting art by Philipp Urlich (he uses Krita).
This one is by Seefat, also made with Krita.
Tom Carlos, “Winter Oak”, made with Inkscape.
Chris Hildenbrand’s unusual yeti-flavored take on the ‘Snowman’ challenge in Inkscape’s Facebook group.
Each of my weekly recaps involves researching, building and testing software, reporting bugs, talking to developers, actually watching videos that I recommend, and only then writing. Time-wise, that's between 10 and 20 hours. If you enjoy the work I do, you can support me on Patreon or make a one-time donation.
Week highlights: new major releases of Blender, Font Manager, DrumGizmo, amSynth, new features in Krita, Olive, new node-based image editor called Cascade.
Graphics
A lot of work on GIMP last week was plumbing, but it was the important kind of plumbing. File plugins are beginning to use a new program-wide API for saving metadata instead of their own code, which varies and is inconsistent from plugin to plugin.
Krita got a new gradient editor. The Mesh mode of the Transform tool now has spinboxes to set rows and columns, and a checkbox to show/hide control points. Also, the Similar Color Selection tool can now work on either all layers or layers with labels, not just the current layer.
There’s a new node-based image editor called Cascade around. It’s all GPU-side processing based on Vulkan (save for G’MIC in git), 32-bit float per channel precision, Qt-based user interface etc. The latest alpha release only has a Windows binary. It’s theoretically possible to build on Linux but I’ve had no luck with either the release or the latest code in the git repo. Do check it out though. And there’s a Pixls thread to talk to the main developer in.
Font Manager 0.8.0 is out with Google Fonts integration, saving comparison sets, Unicode 13.0 support, and more improvements.
Photography
Been a while since I looked at RawTherapee. Quite a few improvements were merged to the main development branch recently, including a vectorscope and a waveform histogram.
3D
Blender 2.91 is out. 
Some of the highlights:
- a metric ton of sculpting improvements including simulation of cloth crinkles etc.;
- custom bevel profiles for curve and text objects;
- a new modifier to convert volume to mesh, which is great for stylized fluids, and a converter of meshes to volume;
- Potrace built into Grease Pencil for fast bitmap tracing, and more Grease Pencil goodness;
- property search field, great for beginners and, generally, users who hate scrolling around for one checkbox;
- complex simulation of rigid bodies collision;
- color tags support in the outliner for better organization.
For more information, see release notes and the release video.
Jeremy Hu keeps doing amazing work with Dust3D:
“Introduce a new file format: .ds3obj. Compare to .glb/.fbx, the new .ds3obj file contains not only the result mesh, rig, motions, and textures, but also includes the node, edge and other original infos from the document. https://t.co/69qJoja1Ud #Dust3D pic.twitter.com/rSTT44aGII” (Jeremy HU, @jeremyhu2016, November 22, 2020)
Gamedev
“Godot 2D Navmesh Generator Alpha is RELEASED! It (hopefully) allows easy and quick creation navmeshes for your 2D games. Check it out https://t.co/ZSxBjtLl1d and let me know about issues (I expect there are lots). #GodotEngine #gamedev #indiedev #screenshotsaturday pic.twitter.com/YoJJ3hqjeU” (Sam Bigos, @Calneon, November 29, 2020)
CAD
The Open Compute Project announced CADCloud which is essentially a CAD/EDA files server and a workbench for FreeCAD 0.19+ to access files on that storage.
Dion Moult added a blog to the OSArch website to cover the topic of using free/libre software for architects.
The schematic editor in LibrePCB now supports standalone text items (you can add, modify, and remove them). 
Video
Quite a few bug fixes and improvements landed in Olive, but also some features:
- transparency support added to transitions
- beginnings of a new record tool (Olive had one in 0.1.x)
- undo/redo now available in the text editor
- better UI feedback when cancelling a task
- better rubberband selection responsiveness.
Kdenlive got some fixes for subtitles.
There’s an OpenTimelineIO exporting add-on for Blender VSE by tin2tin available now.
Music-making
DrumGizmo 0.9.19 is out. Here are some of the changes:
- Default midimaps now read from drumkit file.
- Sample selection default values improved.
- New powermap feature.
- Plugin UI can now be translated into various languages (gettext-based).
- Per-instrument voice limit now available.
See here for more info.
Hermann Meyer released the first public version of Fluida.lv2, a sampler based on FluidSynth. Grab the source code tarball here.
Nick Dowell released a new version of amSynth. It’s been a while since I last looked at it, here’s a quick overview of recent changes:
- Mouse wheel support for controls
- UI upscaling for background and controls on HiDPI displays
- Presets are now available to be loaded in VST hosts using the generic GUI
- Improved HiDPI autodetection and added --force-device-scale-factor command line option
Tarballs are here.
Tutorials
Marta Gvozdinskaya, an Inkscape tutorial on creating a fabric layout for better pattern placement and making the most of the fabric.
Another Inkscape tutorial, from Ahmad Arzaha.
Krita tutorial by Pallab, good luck looking another egg in the face now! 
:)
Artworks
Totally missed this new series by Ngan Pham, made with Krita.
Emilis Baltrusaitis put Google Maps and Blender (as well as some proprietary tools) to an interesting use, see comments in the post for how-to details.
New epic render (Blender) by Rutger van de Steeg with a strong cyberpunk vibe.
Highlights: new releases of GIMP, Scribus, FontForge, Synfig, Tahoma2D, LibreDWG, Giada, Ardour, new features in Krita, Inkscape, Olive, Kdenlive, a new take on fullscreen color management with HDR support in Wayland.
Graphics
There have been two major anniversaries lately.
GIMP turned 25 years old last week and released version 2.99.2 a few weeks prior to that. It’s the first unstable release leading up to v3.0. It was 20 years for FontForge, and another release:
“Today is FontForge's 20th anniversary. Our latest release, 2020-11-07, celebrates twenty years of FontForge. Other changes: Many SVG import bugs fixed, Windows drive letters accessible in open dialog, and Python 2 deprecated. https://t.co/eS0EV56otH pic.twitter.com/VoOvspVqrX” (FontForge, @FontForge, November 8, 2020)
The Krita team merged the last GSoC 2020 project this year, the storyboard editor. There’s a nice walk-through in confifu’s blog. Developers also released their own fork of SeExp, called KSeExpr. According to Boudewijn, they couldn’t get hold of upstream developers regarding various fixes that the Krita team had done. They are in contact now, and KSeExpr has dual licensing to make their changes available to upstream.
Scribus 1.5.6 is out with PDF-based printing (to replace PostScript-based printing eventually), PDF 1.6 exporting, and some more niceties. Here is our video, and you can also read its edited transcript. I spoke to Craig Bradney about the future v1.6 release. Here is what he told me:
We haven't yet come to the final conclusion what v1.6 should be but I think we're getting close. Any form of major UI revamp or changes should really wait now. There's tables that we'd love to finish but we've got some interesting work going on.
Jean has stabilized the notes work a lot. 
He is going to review the new PDF importer code from a new contributor and he is rewriting the copy/paste code right now to make copy/paste between documents work more reliably and cover some import gaps.
I have started working on an API generator script that will give us shells of API commands to be filled in with functionality inspired by Ale's work on some APIs in some way. The aim being we have an API to call and replace all the active code in scripter, and also allow other scripting languages and even internal Scribus code to use the public API where it makes sense. We can really pick a point to release, but just need to decide.
The Inkscape team merged the new dockable dialogs system written by Valentin Ionita over the summer (it was one of the GSoC projects).
Pekka Paalanen announced ongoing work on a fullscreen color management protocol with HDR support for Wayland. This is not the first attempt. Richard Hughes mentioned that back in 2012 (here is my interview with him). The idea is to rely on ICCv4 color profiles with an outlook for iccMAX (which they probably should have stuck to from the ground up).
Photography
Aurélien Pierre finally merged his color mixer module to darktable. The module is now available under the name of ‘color calibration’ and works sort of similarly to what you can find in Lightroom and probably saw a few times in videos like ‘make your colors POP!’, ‘your editing will never be the same again’ etc. You can support him on Liberapay, by the way. Harold le Clément de Saint-Marcq implemented unbounded blending modes, RGB scene-referred parametric masking in JzCzhz color space, and a boost factor to set thresholding above 100% to work on HDR images.
Filmulator 0.9.1 was released with some workflow improvements.
Animation
Synfig 1.4.0 is out with a crazy amount of improvements and new features. 
Some of the release highlights:
- Editable curves in Graphs Panel
- Waveform display and elimination of sync issues
- Transform tool got a control point to change transformation origin
- Animation of any parameter can be baked now
- TimeTrack panel now allows setting playback range and looping the playback
- Simplified importing of image sequences
- Port of the bitmap vectorization tool from OpenToonz
I also completely missed the latest release announcement of Tahoma2D, a recent fork of OpenToonz by Jeremy Bullock who used to be one of the most active OT contributors. Version 1.1 comes with what appears to be a huge blob of small improvements all over the place, including but far from limited to these:
- Vanishing points and advanced straight line drawing on vector levels.
- The vector brush can now draw behind, auto close and auto fill strokes.
- Bitmaps copy-pasting between Tahoma2D and other programs.
- Linux builds and support for webcam on Linux for the stopmotion feature set.
Boris Dalstein announced some interesting improvements in the Animated Cycle Editor in VPaint. See his Patreon post for details.
3D
Blender Foundation announced that Facebook will be joining the Developer Fund in Q4 of 2020. Facebook develops its own augmented reality tools and ships an add-on for Blender. As is usual in such cases, some people started freaking out that Facebook is up to no good.
Here’s my take on that if you care to read. There is no public information about how much they are giving, only that it’s above $300 a month. Facebook’s marketing budget for 2019 was close to $9.9bln. Even if they donated $1mln (which is unlikely, it would be a huge story), that would be nearly 1/10000 of the whole marketing budget. This is peanuts. They would lose a lot more than what they spent if they were caught trying to manipulate Ton.
None of the embrace-extend-extinguish theories I’ve read so far look remotely realistic. 
So this is simply a win-win case: Facebook gets good karma (that they really need) for what is effectively a few cents for them, and Blender gets some more funds.
CAD
Reini Urban released LibreDWG 0.11.1 with a bunch of fixes. The library can now also be included in CMake projects. For the list of changes, please see here.
Yorik van Havre posted his October report for FreeCAD development. Highlights:
- SVG hatches support in the TechDraw workbench
- New style editor for the Draft, Arch, and BIM workbenches
- ImagePlane tool now available in the BIM workbench
- Clean-up in the Material editor
By the way, people started testing the Turning add-on for FreeCAD that uses LibLathe to enhance the Path workbench and control a CNC lathe.
SolveSpace is getting some much needed love lately, mostly from its new maintainer, Paul Kahler.
There’s a new release of BlenderBIM with 52 new features and fixes, including improved material and presentation layer support, improved geolocation support, and various workarounds for IFC files generated by Bentley ProStructures. See here for more details. If you are interested in what’s next for Ladybug Tools, there is a long-ish video on that.
Video
W.P. Morrow aka Good Guy, lead developer of Cinelerra-GG, passed away after being hit by a truck while cycling in early November. He was 66. Phyllis is taking over her late husband’s work. The mailing list seems to be active, but so far there are no new commits to the git repository.
MattKC posted a monthly update on Olive development and laid out basic plans for the near future. The next version, v0.2, will focus on delivering functional cutting and exporting, with proxies, caching, and everything. It might also get a basic color grading node. Then v0.3 will have further node editing improvements and better color management. Most recently, Matt brought back Transform and Mosaic nodes, as well as localization support, and rewrote exporting code. 
Meanwhile, Thomas Wilshaw is busy improving OpenTimelineIO support.
The Kdenlive team merged the subtitle editing GSoC project of Sashmita Raghav. It’s not exactly all functional right now, but since we are talking about the master branch, I’d say it’s safe to expect this working in the next release later this year. At least, it’s available in the recently released beta of v20.12 along with same-track transitions.
Music-making
Giada looper is out with VST3 support and minor UX/UI improvements. There’s a changelog in the downloads section on the website if you want to know more.
Ardour 6.5 is out with VST3 being the major new feature. As you can see, I’m using the newly released Vital softsynth by Matt Tytel. Everybody was wondering if he would release source code but that doesn’t seem to be the case so far. Nevertheless, both LV2 and VST3 versions are available. And, since he didn’t even promise that, welp, at least this is not Lightworks all over again.
There’s another alpha release of Zrythm available, mostly with minor improvements.
Tutorials
Fred Brennan posted a new FontForge tutorial about creating a variable font with non-linear (higher-order) interpolation.
UkrArtDesign, Inkscape.
Making a lightbulb with Blender.
Artworks
New Inkscape artwork by Sven Ebert, reportedly vandalized by Sonya Benett. You can find the pure and unadorned version here :)
Some speedpainting practice from Philipp Urlich.
Sylvia Ritter published another book cover commission she recently did with Blender and Krita.
Syed Waleed Shah posted more artwork made with Krita.
A new unstable version of Scribus is available with several interesting improvements such as PDF-based printing, PDF 1.6 compatibility with OpenType fonts embedding, and more changes.
This was originally available as a video; the edited script goes below.
PDF-based printing and preview
Scribus developers started introducing PDF-based printing support and intend to eventually phase out PostScript-based printing. The two places where you can find this new option are the printing dialog itself and then, of course, the print preview dialog. The Windows version doesn’t have it yet though.
PDF 1.6 compatibility
This version of Scribus also comes with support for PDF 1.6. This revision of the standard is really not a new one, it was out back in 2004.
There are multiple nice improvements over PDF 1.5 (the latest supported version of PDF prior to this version) like an extension of the DeviceN mechanism for defining spot colors, or advanced encryption system, or being able to use PDF files as a container for embedding all sorts of data like 3D models. But the real change you are going to see in Scribus right now is the embedding of OpenType fonts without converting them to TrueType or Type1 fonts.
New Content dock
Now when you select a frame, this dock will show frame-specific controls. If it’s a text frame, you will see text properties. If it’s an image frame, you will get image properties and so on. The layout of the controls is pretty much the same as back when they were part of the Properties dock. So it’s really wide and obviously needs further redesign.
Better on-canvas selection of objects
Selection of objects on the canvas has been improved in two ways. First, you get Ctrl+Click to select items below guides. And then you can Alt+Ctrl+Click to cycle through the items in a group.
Markdown importing
PDF, Adobe’s IDML, Quark’s XTG, and Krita’s KRA importers have all received minor improvements. One more thing that you will probably appreciate is a newly added Markdown importer. 
It is really simplistic, requires manual editing of character and paragraph styles, and supports only basic markup. So no tables or images. But if that’s sufficient, and Markdown is something you already use, this could be helpful.
Slight UI changes
The new version also comes with better support for dark user interface themes, and you can now switch between icon sets without having to restart the application.
So what’s been going on with Scribus really?
The team is still targeting to make 1.6 the official stable release. The overall amount of changes in the 1.5 series as compared to what you get in 1.4 is just massive. Here are just a few of them:
- Qt 5 port and HiDPI support
- Complex scripts support (Arabic, Hebrew, Chinese, Hindi etc.)
- Real table frames
- Orphans and widows control
- Footnotes and end-notes, text variables, cross references
- Lots of new importing options including IDML, XTG, PUB, DOCX, XAR
- CIE L*a*b* and CIE LCH color models support
But it does feel like Scribus has been falling off the face of the Earth in the past few years.
The 1.5 series has been in the works for a very long time. They made the 1.5.0 release available in May 2015, but I demoed some of its new features like the Picture Browser as far back as 2009 at an exhibition here in Moscow.
If you look at source code changes, the people working on the project are pretty much the same people who have been active with Scribus for the past 15 years: Craig Bradney, Jean Ghali, Alessandro Rimoldi. But the project’s founder Franz Schmid left for family reasons in 2016, and there are no new people to make up for that departure.
I think part of the reason is that, lately, the publishing industry is not at its best. Unless, of course, we are talking about the organized crime we know as textbook publishing. And when an industry is not doing well, there’s less interest in making software for it.
And then again desktop publishing is just not as fancy as, let’s say, 3D and visual effects. 
So there’s less interest from computer science students. Scribus did participate in Google Summer of Code in the past. In fact, new table frames in the 1.5 series are one of the results. But it’s been a while since Scribus last participated, and none of the past students stuck with the project.
I know I’m probably making it sound a little depressing here, that is really not my intention. I’ve been following Scribus since 2002 when I landed my first and, by now, probably my last fulltime job in the publishing industry. I do have a special place for the project in my cold, cold heart. I’m just worried.
Even with money flowing out of advertising on paper and right into online advertising, Scribus continues to be an indispensable tool for desktop publishing when you can’t go for a proprietary solution. I would very much love to see the project find a way to get new people in.
For now, let’s hope that Scribus 1.6 is not too far away from now.
Now that the GIMP team is back to releasing both stable and unstable updates, here is the question that a lot of people will be asking. Should they go for 2.99.2? Or should they stick to the stable 2.10 series? What can you realistically expect from this particular release of GIMP and upcoming GIMP v3.0 in general? Here is what I think you need to know to draw your own conclusion.

This article is a transcript of the video below. Here is my obligatory disclaimer that I’m affiliated with the GIMP project, so you should take anything I say about it with a grain of salt.

GIMP 2.99.2 has been released and it actually comes with changes that are not available in the stable series which is 2.10. It also comes with a few regressions. So let’s talk about major differences.

Does it look better?

For some reason, there’s a popular opinion that GTK3, the user interface toolkit, is going to right many wrongs for GIMP in terms of user interface. This is partially true and partially false.

The new unstable series is based on a newer version of the user interface toolkit called GTK. This brings a variety of changes. The most important difference is that the unstable version of GIMP handles HiDPI displays vastly better because the support is built right into the user interface toolkit and thus there is no need to add ugly hacks. Simply put, if you have a 4K display, you should be fine now. If you have a FullHD laptop and an external HiDPI display and you use GIMP on both of them, you should be fine as well. Toolbox icons are not tiny, brush previews are not tiny and so on. Although, right now all I have here is a smaller HiDPI display, 2560 by 1440 pixels, and everything looks kinda huge to me, even with fractional scaling.

I also don’t really like numeric controls in GTK3. This will be evident, for example, in the updated slider widget where you get minus and plus buttons right next to the slider.
It’s probably okay for simple desktop applications but it’s just horrible for cases like GIMP where these buttons occupy the space that would be otherwise used for showing more of the actual content. Users with touch displays might disagree with me though.

To give you a better idea, here is the window of the Convolution Matrix filter from the unstable branch and here it is from the stable branch based on GTK2. With the GTK3-based version, you simply don’t see as much of the content that you can preview on the canvas.

And if you look at the height of the sliders, it’s really not as good as what you know from the stable series on a HiDPI display. This was discussed in the developers' chat and the agreement seems to be that this can be remedied by introducing a small variation of the user interface theme. So in terms of user interface in this particular unstable release, it’s up to you to decide which one is more important to you: decent HiDPI support in terms of icons on a 4K display or the size of other widgets like sliders and spinboxes.

Does it work faster?

This is a ‘yes and no’ again. There are no changes to make, for example, filters work faster. This is mostly not on GIMP’s side anyway, it’s up to the GEGL library which is the image processing engine.

However, the unstable series now features render caching. What it does is basically create a bitmap out of everything you see on the display: the projection of all layers, any display filters you might be using, the selection cue, if there is one, and so on.

(Features Coffee Run poster by Hjalti Hjálmarsson, CC-BY 4.0.)

So when you zoom in on your project and you pan around, GIMP doesn’t have to rebuild all that for each new pixel that appears in the viewport. It just displays this pre-built cache. This basically means you get much snappier navigation which is essential when you work on large projects with a lot of layers.

Other than that, do not expect any major performance enhancements just yet.
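To make the render caching idea a bit more tangible, here is a toy sketch in Python. It is not GIMP’s actual code: the `ProjectionCache` class, the `visible_tiles` helper, the 64-pixel tile size and the `fake_render` function are all made up for illustration. It only shows the general principle of keeping already composited tiles around so that panning renders just the newly exposed part of the viewport.

```python
class ProjectionCache:
    """Toy cache of composited viewport tiles, keyed by (zoom, tile_x, tile_y)."""

    def __init__(self, render_tile):
        # render_tile stands in for the expensive step: compositing layers,
        # display filters, the selection cue and so on (hypothetical callback).
        self._render_tile = render_tile
        self._tiles = {}
        self.renders = 0  # counts how many tiles were actually rendered

    def get(self, zoom, tx, ty):
        key = (zoom, tx, ty)
        if key not in self._tiles:
            self._tiles[key] = self._render_tile(zoom, tx, ty)
            self.renders += 1
        return self._tiles[key]

    def invalidate(self):
        # Called when the image changes, so stale tiles are never displayed.
        self._tiles.clear()


def visible_tiles(x0, y0, width, height, tile=64):
    """Return tile coordinates covering a viewport rectangle given in pixels."""
    return [(tx, ty)
            for ty in range(y0 // tile, (y0 + height - 1) // tile + 1)
            for tx in range(x0 // tile, (x0 + width - 1) // tile + 1)]


def fake_render(zoom, tx, ty):
    # Placeholder for real compositing work.
    return f"tile z={zoom} ({tx},{ty})"


cache = ProjectionCache(fake_render)

# First look at a 128x128 viewport: every visible tile has to be rendered.
for tx, ty in visible_tiles(0, 0, 128, 128):
    cache.get(1, tx, ty)
first_pass = cache.renders

# Pan 64 pixels to the right: only the newly exposed column is rendered.
for tx, ty in visible_tiles(64, 0, 128, 128):
    cache.get(1, tx, ty)
second_pass = cache.renders - first_pass
```

In this sketch the first pass renders four tiles and the pan renders only two more, which is the whole point: the more of the view is already cached, the less work each navigation step costs.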
Does it improve the editing workflow?

Yes, absolutely. One major change in the unstable series is the long anticipated support for multiple layers selection. If you need to move many layers to a layer group, you want this. If you need to color-tag a bunch of layers, you want this. If you want to assign a mask to multiple layers, you pray to god you’d be able to do that.

Is this feature-complete? Not yet. Even though Jehan Pages rewrote about 90% of all code that is related to selecting items (and trust me, that’s a huge amount of work), there’s still more programming to do. A lot of features are made multiple selection aware but don’t allow changing multiple layers. So they will report they know there’s more than one layer selected, they just can’t do anything about it.

Are there any benefits for painting?

Indeed yes! Graphic tablets support in GTK3 is vastly better than what you get for software based on GTK2. One major change here is hotplugging of the devices. If you plug a graphic tablet into your computer and you already have GIMP 2.10 running, you have to save all your projects and restart GIMP for the device to become usable. And then, of course, you have to reopen your projects to resume your work. By contrast, once you plug e.g. a Wacom tablet into a laptop running the newly released unstable version of GIMP, the relevant devices show up in the Device Status dock immediately, just like magic! :) So if you always have your tablet on your desk but you move around the house with just the laptop, you are going to appreciate this.

Does it improve resources management?

Not yet. There has been an important project in the pipeline for a few years now.
Jehan Pages is working towards simplified management of all sorts of extensions: brushes, scripts, plugins and so on.

What the team has so far is the skeleton of a new dialog called Manage Extensions and support for a new file format called GEX, or GIMP EXtensions, that contains any kind of additional resources that can be installed. One of the important missing bits there is an online backend to store extensions and allow searching for them from within GIMP and then installing them.

So the right answer to the question about better resources management would be ‘eventually’.

Any color management improvements?

Yes, there are several major changes in this series that change how color management works.

The idea is that you can either work with images in their native color spaces (that is, taking into consideration primaries and transfer functions) or enforce sRGB which is kept for the sake of compatibility with old workflows but also because of the sheer ubiquity of sRGB. This work is not complete yet, there’s more to be done.

Now, the release notes mention something called space invasion. I think this deserves a more verbose explanation, but for now, here is the general idea.

As you probably know, GIMP now uses a non-destructive image processing engine called GEGL but doesn’t yet expose any non-destructive features. In a non-destructive context, each modification, whether it’s cropping or a filter, is preserved as a node in a graph and can be altered at any time later. You just don’t get to see nodes or access them directly yet.

The layers dialog is a sort of familiar representation of some nodes in GIMP, and the on-canvas preview of most filters is provided by a temporary hidden node until it gets merged down into the edited layer. But that’s about it for now.

So, about that space invasion thing. Suppose you opened an image that is tagged with an AdobeRGB color profile.
Then you applied a bunch of filters to it, and some of those filters use the LAB color model rather than RGB. So the data will be converted back and forth between different representations, also called pixel formats, quite a few times. But you do need your final image to still retain that AdobeRGB color space attribution.

GIMP now uses GEGL’s ability to send the information about the original color space across the whole node composition tree. Whether a filter works on an RGB or LAB or HSV representation of data, it will also pass the color space information to the next node in the tree so that this information isn’t lost and GIMP can make sense of the data that the last filter in the chain generates.

One of the things that apparently are on the TODO list for version 3 is fixing the magenta mess in the color selection dialog. GIMP still thinks all colors are in the sRGB color space, in fact that’s still assumed whenever you load GIMP color swatches. So whenever you dial a color that’s out of gamut, whole sections of color sliders get the magenta fill. Apparently, at least one of the active developers is willing to address that.

It is highly unlikely that space invasion will end up in the stable 2.10 series. So if you do work on RGB images that have color profiles other than sRGB and you do some seriously heavy editing, version 2.99.2 might work better for you in that regard. The usual disclaimer about unstable releases still applies.

Are there more features in general?

Not many. There are some minor changes as compared to the stable 2.10 series. For example, the plugin to support HEIF and AVIF file formats has more features like selecting color subsampling/pixel format and encoder speed. But that’s as far as changes go so far. So, not much to look forward to. The plan to backport new features to the stable branch has made the stable branch a lot of fun but it also made using the unstable series considerably more boring.
Do you lose any features you know from earlier releases?

Sort of. Several old Python plugins have been moved to the gimp-data-extras package. So you can still have them, but you will also have to update the code to match API changes in the new series.

Speaking of which, all 3rd party plugins will have to be updated. So if you rely on something like BIMP or Resynthesizer in your workflow, you really should wait for their respective updates. Or take matters into your own hands, if you feel like doing so.

Does it finally support CMYK or non-destructive editing?

No. As you might have already surmised, while the image processing engine is capable of both these things, GIMP does not yet provide any user interface for that. Both features are planned for future releases and can already be worked on, but the existing team is stretched thin so you will either have to wait or begin working on that yourself.

Are there any benefits for plugin developers?

The answer is ‘lots’! GIMP now relies on a technology called GObject Introspection to access GIMP’s application programming interface. This made it cheap to add more languages you can write new plugins in. I’m talking about JavaScript, Lua, Vala, and the word on the street is that Rust might soon join the band. Some example code is available to study.

The other fun thing is that you can now directly access GEGL in your 3rd party plugins. In fact, there are a few such plugins written in Python already available on GitHub. Again, a great source of information for people interested in doing their own plugins.

There are more changes worth mentioning to developers and they are nicely covered in the official release notes.

What’s next?

It’s most likely that, for the rest of 2020 and at least the first half of 2021, the team will continue releasing both stable and unstable new versions of GIMP.
At some point, perhaps next year, they will drop updating the 2.10 series and give their full attention to releasing version 3.0.

And in conclusion, I know for a fact that there will always be people who will be like, f--- it, I’m switching to the unstable version anyway and you can’t talk me out of it. My only plea to you is that, rather than complaining on some bizarre forum no GIMP developer knows about, you would send sensible bug reports and do your best to test patches for the bugs that you discovered.

I think that pretty much covers it! :)

Artist credits:

- Coffee Run poster, CC-BY 4.0, Hjalti Hjálmarsson
- Spear of Jealousy, Fredrik Persson aka Sevenix
The Cinelerra community is mourning the loss of W.P. Morrow aka Good Guy who passed away at the age of 66 after being hit by a truck while cycling near his home in Longmont, Colorado, last week.

Bill was lead developer of the Cinelerra-GG video editor since 2016, and prior to that, he worked with Michael Collins on Cinelerra-HV.

I never really had the pleasure to talk to him. So I think it’s best to pass the mic to people who did know him.

Igor Vladimirskiy:

GG was the user-friendliest guy among Cinelerra-* developers. He had a lot of spare time, he was able to figure out complicated source code, and, what’s really important, he was really good at understanding and programming digital content creation software. There was nothing he couldn’t do. And he could explain it all in layman terms to users.

Sam:

He was never too comfortable to tackle even small problems and improvements to make the life of the users easier. You could tell that he enjoyed programming a lot. He enjoyed improving Cinelerra-GG and interacting with the community. He has acted selflessly and for the good of all, and through his commitment to free and open software, he has made this world a bit freer and better.

MatN:

I am devastated by this news. My condolences to Phyllis and family, I wish her strength in dealing with this loss. GG will be much missed, his support and dedication have set an example. And I am grateful for all he did to Cinelerra. He and Phyllis published over 50 monthly releases since 2016. He was always willing to explain how things worked.

Haldun Altan:

Dear Phyllis, please receive my love and wishes for courage, patience. And please imagine in these sad days this community communing with you in their thoughts and minds, grateful to a Guy who did Good to this world. Love and deepest condolences.

Is there a future for this project with Bill gone? There could be.
Here is what Sam had to say:

Phyllis has expressed the wish to resume work on the project at a later date, currently she needs some time off, and to participate in the documentation and correspondence as usual. I will continue to support this project as before. The monthly releases cannot be offered in the same way at the moment. Minor changes and improvements will take place from time to time. We are open for new developers and hope for your support.
Week highlights: Blender Conference was online this year, new releases of BlenderBIM, Pitivi, Shotcut, Qtractor, new free/libre VST3 plugins for Linux, new features in Inkscape, Krita, FreeCAD, Dust3D.

Graphics

The GIMP team was mostly busy with all things related to releasing and packaging GIMP 2.99.2, the first version based on GTK3. The announcement is coming soonish. Meanwhile, Jehan and Akkana started writing a guide on porting older GIMP plug-ins to the GIMP 3.0 APIs.

Krita developers have been busy too. The updated Recorder plug-in has been merged to the main development branch and will be available in Krita 5.0. See this thread for details. Dmitrii Utkin is not stopping at that though, there’s another enhancement coming: snapshots management. And Boudewijn Rempt will probably soon merge his patch adding tool presets.

A small new feature just got completed: tool option presets. Tools now remember their settings, and you can save those settings with a name, and restore them with a single click. pic.twitter.com/s9e6nuMH9q (Krita Painting App, @Krita_Painting, October 31, 2020)

The last two GSoC projects this year, Storyboard and MyPaint Engine, are now mostly ready for merging too, some minor work might still happen in respective branches.

Scribus developers recently sent the call for translators to update .ts files for upcoming v1.5.6. There’s no date set for the release that I know of, but usually it’s soon enough after the deadline for translation updates submission (which was Nov 1).

Inkscape got a new display mode called Outline Overlay. It basically dims all fills by making them semi-transparent and then renders all strokes at once with the same stroke width, even for objects that have no stroke. Another cool new feature is the Slice live path effect. Basically, you can divide a shape into two parts and edit each half’s fill/stroke separately, then drag the separation line wherever you like. It’s rather unstable at the moment though, e.g.
Inkscape crashes when you try to edit the original path.

3D

The annual Blender Conference was online this year, for reasons too obvious to mention. You can watch this video to see all the short talks:

Some really exciting things are going on with Dust3D:

Replaced old motion editor with the new simple motion generator #Dust3D #ProceduralAnimation pic.twitter.com/Vg8BYibA5p (Jeremy HU, @jeremyhu2016, November 1, 2020)

CAD

Yorik van Havre posted a long overview of everything he did over the summer while working on FreeCAD. Here is a quick list:

- Materials now have a Section Color property, for when the material is viewed and when it is cut through.
- A new preferences setting is available for default hatch pattern scale, this will be useful when you do a lot of floor/building plans at the same scale.
- Section planes now have a Show Label property to show the section plane label in the 3D view to differentiate between many planes.
- You can now import reference images like floor plans, position them on a common reference point, and scale them to the desired size.
- The FreeCAD exporter for Blender got a few minor updates.

Meanwhile, Dion Moult made another BlenderBIM release. Lots of good things, but if I were to mention just one, it would be a set of basic nodes for Sverchok to read, write, and manipulate IFC data.

Video

The Pitivi team skipped the v1.0 release altogether, switched to a YEAR.MONTH scheme and released v2020.09 with everything they have had in the pipeline since 2017, and that’s a lot. Some of the highlights:

- plugin system
- support for custom UI for effects (e.g. for lift-gamma-gain)
- timeline markers
- nested timelines
- redesigned library of effects and improved workflow for adding them
- better UI for effects that allows changing settings for multiple clips at once
- new render dialog
- composition guidelines in the viewer
- safe area visualization

See release notes for details.
You can try installing from Flathub but in my experience, that build is messy: it doesn’t follow custom scaling on HiDPI and doesn’t honor the preferences switch for a dark theme. On the other hand, recently released Ubuntu 20.10 already has the update (20.04 LTS does not).

Shotcut 20.10.31 was released. Dan nuked the rest of the HTML5-based filters and started providing conversion infrastructure, e.g. a Text: HTML to Text: Rich converter. There’s also a new filter for audio polarity inversion and a bunch of small usability fixes. See full release notes for details.

Kdenlive developers have been busy fixing bugs and adding small new features like multiple tracks deletion and being able to apply bin tag color to timeline clips.

Music-making

There have been two major VST3 releases for Linux lately.

SuperAmp is a new open source (Apache License 2.0) WaveNet-based guitar amp simulator. People report that they get ~20% CPU load in Reaper on Windows, but I’m getting 100% CPU load on Linux for the VST3 (haven’t tried the standalone version yet). So at least this version is probably OK for offline re-amping but maybe not for live playing. Grab it here to give it a spin.

Odin 2 is finally available as a native VST3 for Linux. This is a really massive release, so much so that Unfa already shot a looooong video about it. The installation is a little messed up: you need to copy the presets folder to a hardwired folder in /opt. But once you get past this minor annoyance, you are treated to a really, really good synth.

Qtractor 0.9.18 is out with mostly small usability improvements and bug fixes. No big stuff but well worth upgrading. The only thing I wonder is when Rui finally decides it’s time to call this a v1.0. Of course, there’s always the Pitivi course of action :)

Unofficially, Ardour 6.4 (that is coming really soon now, it seems) is likely to be one of the last releases in the 6.x series.
Paul has made a lot of progress in the nutempo2 branch, and that’s the single most important feature in upcoming v7. In short, this is going to fix a bunch of annoyances when dealing with both MIDI and audio data thanks to rewriting much of Ardour around the concept of a superclock that Paul nicely explained in this 2018 interview.

There’s more interesting stuff in the pipeline but I’ve learned my lesson with regards to Ardour: let them polish and merge it to the main development branch first!

Tutorials

A new GIMP tutorial by Zakey Design on creating a coffee poster design:

DIY3DTECH.com explains how to use Inkscape to create a map for laser engraving:

Wonder Woman painting with Krita timelapse by someone who clearly shouldn’t be calling themselves lazy:

RoseRedTiger did a tutorial on sculpting a bat girl with Blender 2.9, and she used an unstable build of the program too!

Artworks

Brellias posted a post-apocalyptic scene made entirely with Blender:

Tansy Branscombe published a triptych painted with Krita. There’s now an artist feature with her on Krita.org.

User pjv at Krita Artists posted a great test of charcoal brushes:

William Nugroho posted a commission where the sketch was done in Photoshop and the final version was made entirely with Inkscape:
There has been just so much news since my last weekly recap. Unfortunately, these days, I have to prioritize my family. So this is going to be an update that covers both recent and not so recent events. Meanwhile, I’d rather skip the Blender 2.90 release because it’s been covered so extensively, and v2.91 is in beta now anyway.

Graphics

Krita 4.4.0 is finally out. Highlights:

- Multiple fill layer updates: multi-threading, screentone, multigrid, and SeExpr options
- Diagonal selection lines in MyPaint color selector
- Gradients can now use current foreground and background colors

Here is the full list of changes. And here is a video:

Quite coincidentally, GIMP 2.10.22 became a file formats update release. It wasn’t planned, it just happened.

- Improved HEIC support and newly added AVIF support
- Quite a few improvements in the semi-forgotten PSP file format plugin
- Improved multi-layer TIFF exporting
- Exif’s Orientation meta tag is now dropped to avoid extra rotation
- The GEGL operation tool now has a Sample merged option for when you need to pick a color from the canvas

MyPaint got several nice updates in the past several months. If you’ve been missing numeric input for values, you can now double-click on a slider to get this:

You can also use the Shift modifier to enable precision mode where the same amount of force will change the value in smaller increments.

Frame dimensions can now be edited in the sidebar rather than in the modal dialog:

One new feature added back in July that I haven’t figured out yet is brush resizing by dragging on the canvas. The commit message is unspecific, and all modifiers I’ve tried don’t do anything like that.
Let me know if you manage to make it work!

Something I did not expect to happen was me taking a rare spare evening to go through patches and bug reports for Fontmatrix, applying as many fixes as possible, and tagging the first release in 11 years.

It comes with a basic Qt5 port, an update to match Unicode 13.0 coverage, a bunch of UI changes made by Pierre Marchand around 2011, and all sorts of fixes contributed by wonderful people who stopped by long enough to help with this or that. The sad fact is that there is no real developer working on Fontmatrix today. I was never involved with the project beyond fixing some UI stuff, writing docs, and translating the program into my native language. The only reason I tagged the release is that people actually wanted to access the latest stuff available while their distros were only shipping version 0.6.0 from years and years ago.

When Pierre left the project, there was a lot of unfinished stuff there. Ideas he started playing with but never completed. New features that never stabilized. Loose ends all around. I mean, it even ships with a local copy of harfbuzz from 13 years ago. And you really don’t want to look at some parts of the UI on a HiDPI display. Still, I believe this had to be done.

Michael Murphy (of System76 fame) released FontFinder 2.0.0, an application for browsing and installing fonts from Google’s typefaces directory. There aren’t any major new features in this release, contrary to what you’d expect from a major update. Still, if you use it, you might want to update.

Photography

Perhaps the most visible change in darktable so far is the complete reorganization of effects in the darkroom mode. It now defaults to four groups: active, technical, grading, and effects. The really huge difference is that you can now reorganize effects however you like, make custom groups, assign icons to them etc. And you can switch between these organization presets. A few more things left to mention.
Ralf Brown added a new implementation of nonlocal-means denoising. Hanno Schwalm added a basic monochrome image workflow. And another thing you can see on the first screenshot above: Hubert Kowalski says it’s no big deal, but people have been waiting for colored white balance sliders in darktable since forever, and they are now getting both that and a few other improvements. See the PR thread for technical details if you want to know the background.

Alberto Griggio released ART 1.5.1 with new demosaic methods, a better RAW histogram, and support for Canon EOS R5 and R6.

One fun little project I had some time to get to know recently is Filmulator, a program for developing raw photos. It’s kind of intentionally stripped down to essential features. You don’t get things like sharpening or a blemish removal brush but you do get some interesting controls over tonal reproduction. Here is the list of changes.

3D

Blender 2.91 is now in beta. Here is a nice preview:

Meanwhile, Blender Foundation is now an official Associate Member of The Khronos Group. Here is how Ton explains what that means for the project:

The benefits are very practical; Blender uses open standards, these standards are being designed and upgraded by Khronos groups. We now get information early and can help shaping new versions of standards. Examples: OpenGL, OpenCL, OpenXR, Vulkan, Gltf, Collada. (Ton Roosendaal, @tonroosendaal, October 21, 2020)

CAD

There have been two BlenderBIM releases lately, Dion Moult is really on fire! I find it hard to pick even a few changes out of dozens and dozens (and dozens), help yourself :)

But that’s not all. Ladybug tools for Blender is now a work-in-progress project. It’s easy to see why Dion Moult is so hell-bent on turning Blender into a world-class BIM tool. Blender is just a spectacular platform for building complex things.

QCAD 3.25 is out.
The property editor now displays the total area of a mixed selection of polylines, arcs and circles, and the PDF exporter can now generate PDF/A-1B files. As usual, there are more changes in the non-free professional edition.

There’s a new workbench called Mechatronic available for FreeCAD. It comes with a library of parameterizable mechatronic components like a shaft holder, a breadboard, a linear guide etc. See here for more info.

GitHub user ppaawweeuu updated an older Blender add-on by Dealga McArdle for importing CityGML files (3D city models).

LibrePCB 0.1.5 is out with major improvements like copy-pasting finally working in schematic and board editors. See this blog post for details.

KDE.news recently ran a great interview with Gina Häußge, principal developer of OctoPrint. If you are into 3D printing, you should definitely check it out.

Video

Shotcut 20.09.13 is out featuring new Blur: Pad and Text: Rich filters. It also comes with new default layouts and an easy switcher between those, following a similar change in Kdenlive recently.

MattKC posted a long, detailed update about the state of affairs with the big rewrite of Olive. I think it’s vastly more important than all the recent changes in the git repo (save for newly added OTIO importing maybe). TL;DR: rewriting is overwhelming, the combination of changes has probably never been tried before by anyone, the progress is OK.

Kdenlive 20.08.2 was recently released with a little more than bug fixes. Automatic scene split made its comeback and the Crop By Padding filter was added. See here for the full log of changes.
Music-making

While Paul Davis is focused on the nutempo2 branch in Ardour (to allow freely switching between editing in musical time and samples domain), Lucian Iam resumed his work on the WebSockets-based surface for controlling Ardour from a web browser.

Meanwhile, Ayan Shafqat added code for ARM NEON optimized routines, same as the previous time with AVX instructions for finding and computing peaks, applying gain to a buffer, and mixing buffer with and without gain. All of that is now available in Ardour 6.3 along with other changes.

One really huge new feature that is making it to the next release (v6.4, presumably) is the much anticipated VST3 support. I’ve tested that with recent releases of Surge and Dexed, and it works like a charm. You get MIDI in and audio out. You get automation for controls etc. It seems impossible to see the list of built-in presets in the host’s own drop-down list. But let’s be frank: huge synths tend to have categories for that, and a flat list just wouldn’t cut it anyway.

New in development: initial support for VST3 plugins now available in the main dev branch pic.twitter.com/75RR23bd1R (Ardour DAW, @ardourdaw, September 28, 2020)

There have been several new releases of Zrythm since my last recap. Some of the highlights are:

- Stem exporting
- Free drawing in velocity editor and automation editor
- VST3 support enabled on Linux (rejoice ye Surge users)
- DSSI and LADSPA plugin support
- Support for VST and LADSPA shells (libraries that contain multiple plugins)
- Dragging MIDI and audio files directly into the timeline
- Modulator track now available, modulators can be connected to controls
- Regions can be merged now
- Chord pads
- Add hardware processor for controlling input hardware ports (can now record with RtAudio/RtMidi)
- Write to zrythm.log under the system’s temporary dir until the actual log file is initialized
- Add framework for MIDI functions (such as legato)

See this full list of changes so far.

The sfizz 0.5.0 sampler/synth was recently released.
Some of the major changes are support for Flex EG opcodes by ARIA and LFOs as modulation sources, an actual custom GUI (built with DPF), support for reverb, gate, distortion, and compressor effects, and more. See here for more.

In the unlikely case you missed a PipeWire development update, here is where you can find it. Newsworthy quote:

As Wim reported yesterday things are coming together with both the PulseAudio, Jack and ALSA backends being usable if not 100% feature complete yet.

Tutorials

Luciano Muñoz posted an introduction to Grease Pencil in Blender:

New Inkscape tutorial by Nick Saporito:

Adam Cox explains using QGIS and GIMP for enhancing historic maps:

Ryndon Ricks (TJ FREE), SpongeBob tracing and shading with Inkscape and Krita:

Unfa on making voiceovers with Ardour 6 and free LV2 plugins:

Artworks

MrCoffee Time, Tom Bombadil, made with Krita:

Arga Raditsya B, building through knowledge, books, pen and ink, made with Krita:

Another valley painting by Philipp Urlich, made with Krita. You can watch (more like listen to) a recent interview with Philipp on Twitch by the way.
Weekly highlights: new releases of Siril, Kdenlive, and Zrythm, EEVEE is getting more sky models, LibreDWG gets v0.11 release and goes beta, Inkscape’s GSoC students are making great progress.

Graphics

Dozens of bug fixes landed in Krita last week. Notable changes are further work on SeExpr support in fill layers (e.g. previewing before addition), as well as a few RAW import plugin improvements.

The Inkscape team appears to be working towards releasing v1.0.1 in September. There won’t be any new features, only bug fixes and translation updates.

Developers recently had a hackfest that lasted several days, and one of the days was devoted to Google Summer of Code. You can watch the video, and here’s a quick overview:

- Link Mauve has made some good progress with GPU-side rendering via Pathfinder. There aren’t comparison tests yet (wait for feature parity first) but the general outcome is that the more complex your drawing is, the more benefit you will get from rendering it on the GPU.
- Moazin Khatri did some initial work on porting Inkscape’s path-modifying code out of the aging livarot library.
- Valentin Ionita has also made great progress with rewriting Inkscape’s docking system that now looks more like what you get in GIMP.
- Abhay Raj Singh implemented the command palette (it doesn’t seem to fully cover available commands yet, e.g. you can’t set a clip or use the Union boolean op on paths) and is now working on macro recording.

There’s more news from Akira developers:

Working on making Artboards feature complete by implementing: Background colour, Visibility toggle, Lock toggle, 2 way hover effect pic.twitter.com/Kb18769aNi (AkiraUX, @akiraux, August 12, 2020)

Photography

Siril 0.99.4 is out as a beta of upcoming v1.0. So Cyril and Vincent pretty much spilled the beans for the big release :)

There’s a pretty impressive list of changes there, including a rewritten user interface (in a single window now), 32-bit per channel precision support, and more.
Builds for multiple operating systems are available (AppImage has not been updated yet but Flathub already has the new version).

It looks like ART (the RawTherapee fork) will soon have a new release with some new features (Alberto hasn’t said much more than that yet).

Animation

If you care about Synfig and, for some reason, don’t read their weekly updates, you absolutely should :)

- Videos can now be exported with sound when a sound layer is available
- There’s a Stop button in the rendering progress window now
- Great improvements in the Lottie exporter and the new Skeleton tool (GSoC project this year)

See here for more info.

3D and VFX

Blender 2.90 will be coming with Preetham and Hosek/Wilkie sky models in the Sky Texture node (Hosek/Wilkie comes with ground albedo control).

I’m probably not going to surprise you with more great new stuff coming from Pablo Dobarro.

The Pose and Boundary brushes now support deforming the mesh using cloth simulations. This allows creating bend and compression folds without using collisions. #b3d #devfund https://t.co/MxDm5rFjTg pic.twitter.com/WXG54YGXXA (Pablo Dobarro, @pablodp606, August 14, 2020)

Meanwhile, Manuel Castilla is working on an experimental Blender branch to improve the performance of the compositor.

CAD

LibreDWG 0.11 was recently released. The library is now officially a beta-quality project.

The new version comes with a ton of improvements and new features (it was ever so, really).
Here are some of them:

- support for writing DWG r2004+ files, including r2010, r2013, r2018, but not r2007, more work to be done there
- support for reading material properties and revisionguid fields in 3DSOLID
- support for GeoJSON (RFC7946)
- new dwgfilter utility to use custom jq queries
- new dxfwrite utility
- support for many object/entity types (support for others is now considered stable)

On Reddit, Reini Urban pointed out a few things that need to be improved for LibreDWG to go out of beta:

- support for writing files in more versions of DWG
- scripting bindings have to be improved via dynapi rather than SWIG (Gambas support planned btw)

There’s some work going on to add support for LibreDWG to FreeCAD. It doesn’t work on Linux yet though (at least, if we are talking about AppImage builds).

Video

Kdenlive 20.08 is out with quite a few improvements and new features including (but not limited to) the following changes:

- Named workspaces to quickly switch between different sets of dockable dialogs depending on the task at hand (culling, editing, color-grading etc.)
- Multiple audio stream support (audio routing and channel mapping coming in 20.11 or later)
- Scrollbars in effects and the clip monitors have zoom controls on the edges now (the next release will probably have those in the timeline as well)
- Management over the size of cache and proxy files now available

Olive got Google Crashpad integration on Windows to simplify collecting crash reports. Support for macOS and Linux should come at a later time.

Music-making

Zrythm 0.8.797 was released with smaller improvements like silence insertion, volume control for metronome, and punch in/out recording.

Ayan Shafqat contributed AVX optimized routines to Ardour for things like finding and computing peaks, applying gain to a buffer, and mixing buffer with and without gain.

Tutorials

New Blender tutorial on CG Cookie: Urban Low-poly Building

Luciano Muñoz released a tutorial on the basics of using Grease Pencil for 2D animation.
This is an unusual Inkscape tutorial. This guy basically uses the application to design graffiti for model railways.

Nice if slightly creepy drip portrait effect in GIMP, tutorial by Davies Media Design.

Art

Cola Corridor by arbinsidik, Blender 2.8 and Cycles.

Olivier Pautot published the second render (made with Blender) in the Crowds series.

Alan Amaya, I promise, painted with Krita.

Each of my weekly recaps involves researching, building and testing software, reporting bugs, talking to developers, actually watching videos that I recommend, and only then writing. Time-wise, that's between 10 and 20 hours. If you enjoy the work I do, you can support me on Patreon or make a one-time donation.
Week highlights: new releases of Blender, MusE, Zrythm, MuseScore, new release of darktable shaping up, lots of improvements across multiple major projects like FreeCAD, LibreCAD, BlenderBIM, Siril. Graphics Jehan Pages moved OpenCL support in GIMP back to the experimental department. The reason for that is simple: there is no one to work on OpenCL support right now, not all drivers treat OpenCL equally well, and the amount of visual garbage in the output is just too annoying when you start using e.g. GEGL-based filters. Other than that, work continues on polishing multi-layer selection, the latest change there being the support for merging down multiple selected layers.Most of the changes in Krita’s main development branch since releasing v4.3.0 are bug fixes. The one exception worth mentioning here is a patch by Peter Schatz that allows using RGBA brushtips as gradient maps, which means multi-color brushes among other things.All four Krita’s GSoC students seem to be doing extremely well. All of them have blogs to follow: Leonardo Segovia (dynamic fill layers using Disney’s SeExpr), Saurabh Kumar (storyboarding), Ashwin Dhakaita (MyPaint brush library integration), Sharaf Zaman (SVG mesh gradients support). Photography The darktable team started updating release notes for upcoming version 3.2 expected some time in August. You can have a peek. Most changes last week were support for new cameras, translation updates, and bug fixes.Siril is now able to remove square pattern commonly visible on Fujifilm files (made via X-Trans sensor). There’s a checkbox for that on the Preprocessing tab of the sidebar, and there are some settings to dial in in the Preferences dialog. 3D and VFX Blender Foundation released v2.83.2 that belongs to their LTS (long-time support) release series. 
If you missed this story entirely, the idea is that studios that rely on Blender in production should be able to have all the recent bug fixes without all the new fancy bugs caused by new features, refactoring etc. So the foundation now basically supports one series for two years shipping bugfix updates and nothing else. Pretty much the way GIMP used to have both stable and unstable branches, save for the 4-to-6 development cycle in the past :)They also have quite a few GSoC projects this year (again). I haven’t read all the reports yet, but if you are really, really interested, each project has a dedicated DevTalk thread.Meanwhile in the sculpting land Sculpt: Boundary BrushThis new brush is designed to edit cloth and hard surface assets. It detects and deforms the mesh using its boundary edge loops. The initial version includes bend, expand, inflate, grab and twist deformation modes. #b3d #devfund https://t.co/vHKSG9IOUI pic.twitter.com/TRvOK1Rw7J— Pablo Dobarro (@pablodp606) July 20, 2020Jeremy Hu has some great stuff for Dust3D in the pipeline: Report the progress on implementing <Interactively Controlled Quad Remeshing of High Resolution 3D Models>, this paper is a patch to fix the two main issues of MIQ method: running speed and edge flow. (1/5) https://t.co/pw2wsXAnFi pic.twitter.com/V2DVpQ8hoj— Jeremy HU (@jeremyhu2016) July 12, 2020You might also like checking out Chordata campaign on Kickstarter. The project calls itself an open-source motion capture system. If you look at their project on Gitlab, they already uploaded PCBs (designed with KiCad, no less) and a Blender add-on to receive, record, and retransmit physical motion capture data. Oh, and we need to talk about Natron again. 
There’s a new discussion on Facebook where Ole-André Rodlie, one of the contributors, basically said this:

Natron 3.0 will probably not happen, the code is buggy, unfinished and undocumented. At this point Frédéric is probably the only person that could continue the work, and he does not have the time. Some features could be backported to 2.x. At this time we have enough open issues for 2.x. I would rather get that stable. 3.0 will require new developers to join the project.

CAD

I wish I could show you screenshots of the LibreCAD v3 GUI rewrite that Akhil Nair has been working on as his GSoC 2020 project. But the master branch stopped building for me a while ago.

Nevertheless, there seem to be a lot of exciting changes. You can read more about that in a recent report (that one has a couple of screenshots).

FreeCAD keeps getting its fair share of changes in the FEM workbench (they also now have a GSoC student contributing there), and the Arch/BIM workbenches are being actively worked on too. I wholeheartedly recommend reading Yorik’s excellent monthly reports. Here is the gist of the latest one.

- Clones will probably be gradually phased out by App Links. So far, Yorik’s experiments are encouraging.
- Multicore IFC importer thanks to recent changes in IfcOpenShell. Yorik says that “a fairly large, 50Mb IFC file opens in a couple of minutes” now.
- You can now optionally include structural analysis data within IFC files when exporting. Further reuse of the FEM workbench is possible for doing things like setting node restrictions and load cases.

It’s been almost a month since the last BlenderBIM release which means we are probably about to get an update. The latest one from June 21 allows seeing the full hierarchy of materials and styles, and adds support for IFC4 georeferencing data on importing, rebar importing improvements, surface styles support for native geometry, and more. See the post by Dion Moult for a vastly larger list of changes.
Music-making

Although I’ve been following Zrythm since winter ‘19, I haven’t really posted about this new DAW here yet. My reason for that is multifold. Mostly, it’s because I’ve been waiting for the program to stabilize and it just hasn’t yet.

I mean, Unfa specifically launched a counter in a live stream when he was trying to make a tune using Zrythm, and the counter stopped at 7 crashes after two hours. Anyway, I guess, there’s now a historical imperative for me to cover it :)

If you want to know why this project was started, have a look at this thread in the project’s forum where the principal developer explains that. For the record, Zrythm actually reuses some code from Ardour but Alex (the principal developer) rewrites it in C which he prefers over C++.

So, Zrythm 0.8.694 is out with quite a few new features. Here are just some of them:

- Routing from chord track to instrument tracks
- Shift+selection for selecting multiple tracks/channels
- Make port connections and channel sends undoable
- More API coverage for scripts in Guile, the GNU project’s implementation of the Scheme scripting language

Interestingly, Alex adopted the very same model from Ardour where he provides fully-functional binary builds of the program for money despite his dislike for paywalls. My educated guess is that this is simply because this model is known to work (although quite a few people will tell you in great detail how much they hate it).

MusE 3.1.1 is out with mostly bug fixes over the previous update. One of the new features is support for the MIDNAM extension in LV2 plugins (something Qtractor recently got as well). See release notes for a full list of changes.

MuseScore 3.5 release candidate is out with minor improvements and bug fixes. Collectively, MuseScore 3.5 is going to represent a lot of quick wins usability-wise, as you might have already guessed from the past few updates.
You can get the gist of it from a much earlier video by Martin “Tantacrul” Keary on changes in v3.3.

All three GSoC projects by MuseScore are making very nice progress. You can see students’ reports on the community blog.

Again, if you missed the news, MuseScore v4 is shaping up to become a massive update. You can read more about that in their blog. Some of the latest changes in the main development branch are the basics of workspace management and a whole new audio engine (that seems to reuse SoLoud audio engine targeted at gamedev, for some reason).

Dragonfly Reverb 3.2.0 doesn’t really have any exciting changes (unless one less dependency, which is libsamplerate here, is your thing) but Unfa made a point of showcasing this effect, so why not mention it as well?

Tutorials

Learn how to use math input in GIMP to make simple calculations right in the numeric input widgets.

Pixls user ‘scribbleed’ posted a tutorial on animating photos with G’MIC and GIMP.

Great speedpainting with Krita, by Denis Godyna. My only concern here is that the timelapse is a little too fast-paced, it’s hard to see the work on simpler objects like those trees at the end.

Make a cushion in Blender in 4 minutes, by Andrew Price.

Jack Lynch explains his photogrammetry workflow with Meshroom and Blender.

How to use Rebar tool in FreeCAD’s Arch workbench.
Art

Fern Hamblin Gadd, a landscape painted with Krita.

‘Forest Temple’ by Maciek Drabik, Blender (both Eevee and Cycles used for rendering).

Stefanie Meer, rat sculpt, Zbrush/Blender.
One semi-hidden feature in GIMP is that you can do simple calculations in the spinbox widgets where you input numeric values. This tutorial explains how it works and all the advanced things you can do there.

So this recent repost on Twitter by the GIMP account (and originally by Andrei Rybak) blew up a little. Clearly, not everybody knew that you could do simple calculations right in the input fields for numbers. But… there’s actually more to tell than what you can see in that short video clip.

The text below is a full transcript of the original video, with one single exception specifically pointed out later.

How math expression calculations work

The general idea is that you should be able to do simple calculations right in the widget where you set some value.

So you place the editing cursor, for example, at the end of the current value, then write the arithmetic operation like minus for subtraction, add another value and then press the Tab key to get GIMP to calculate the result. Now, there’s nothing wrong with training yourself to do all the calculations in your head. But if you are not that type of a person, you might find this quite useful.

And it is really not a new feature. It was added by Fredrik Alströmer in 2009 and then released three years later in version 2.8. Blender has had it for a number of years, I’m not even sure for how long exactly. Krita got it in 2016 thanks to Laurent Valentin Jospin. And even Photoshop got that in an update just a year ago, and I do believe it’s the last application in the Creative Suite to get that feature. Well, I guess that’s the corporate planning for you.

Some applications go even further. E.g. enve allows writing JavaScript code to make complex expressions, and it’s even animatable. But I’m digressing already, so that’s probably a story for another time.

Where it works

One important thing is that it doesn’t yet work everywhere. As of version 2.10.20, GIMP only supports basic expressions in the widget called spinbox.
That’s the generic input field with two arrow buttons pointing up and down. You can find it in dialogs like Scale Image or in the Tool’s Settings dock. It doesn’t yet work in the slider widget which is where you would expect it to work. The team actually discussed this a while ago and is positive about having that. So all it needs is someone to write a patch.

It also doesn’t work in spinboxes in plugins, for a purely technical reason that, I believe, can also be remedied.

What expressions you can use

So what you can do is add (+), subtract (-), divide (/), multiply (*), and raise to the Nth power (^).

GIMP even respects the order of operations, so if you go for something like 600-2*5, the answer will be 590. Which means you can abuse GIMP to solve math quizzes online :)

You can also use parentheses if you want to. So this kind of an expression, (200*3)*2+(4/2), will yield the correct result which is 1202. Which is so much not a common use case but… what the hell!

Something I didn’t know about (so it didn’t make it into the video above) is that you can also use ratio expressions. Ell was kind enough to point that out.

Supposing you know the width of the image and you know you want it in 16:9 ratio. Put that width into the ‘Width’ box, then write ‘16:9’ in the ‘Height’ box and press TAB. There, you have it.

Mixing units

Now if that wasn’t advanced enough for you, how about mixing units? You can do an expression like 10cm+3in, and you will get 17.619cm. I’ll give you a very silly yet somewhat realistic example.

Supposing, you’re watching an American crime show, and someone reports a suspect who is 5 feet 4 inches tall. Now, if you are born and raised in the metric system, you are probably damned if you know how much exactly that is. Well, fire up GIMP, press Ctrl+N for a new image, switch to centimeters, write 5ft + 4in and press Tab. Et voila!

Here is how it works. GIMP comes with a rarely used dialog called Units.
This is pretty much a reference table for conversions between various length units that GIMP supports. This conversion system uses inches as the base unit. This is likely because resolutions in computer graphics are still commonly measured in inches. Either way, for math expressions, you can use any of the units you can find in that dialog. You just need to reference a unit by its abbreviated name. You can look it up in the Units dialog here, in this column.

You can even create your own unit, like a sea mile or a light year, and then mix it with feet or angstroms, if that’s where your fancy takes you.

Future work?

Personally, I think this feature could be further improved in at least three ways.

The first one is adding support for math expressions to the slider widget. The second one is adding some kind of a hint that math expressions are possible in a widget. This would improve the discoverability of this feature. And finally, making more functions available. I’m primarily thinking here of trigonometric functions like sine and tangent. Some of the other applications like Krita already support that.

All three requests have been filed to the bug tracker. We’ll see if anyone is fond enough of the ideas to actually write the code.

Hopefully, you can make a slightly better use of GIMP now that you know all this.
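The unit mixing and ratio tricks described in this tutorial can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not GIMP's actual parser: the `INCHES_PER_UNIT` table and the function names `add_lengths` and `height_for_ratio` are made up for this example, and the sketch handles only sums and differences of values with units. Like the Units dialog, it converts everything to inches first. The reading of the ratio trick (height = width × 9 / 16 for a 16:9 ratio) is also an assumption.

```python
import re

# Hypothetical conversion table: how many inches one of each unit is.
# Inches are the base unit, mirroring what the article says about GIMP.
INCHES_PER_UNIT = {
    "in": 1.0,
    "ft": 12.0,
    "cm": 1.0 / 2.54,
    "mm": 1.0 / 25.4,
}


def add_lengths(expression: str, result_unit: str) -> float:
    """Evaluate a sum such as '10cm+3in' and return it in result_unit."""
    compact = re.sub(r"\s+", "", expression)
    # Accept only a chain of signed numbers, each followed by a unit name.
    if not re.fullmatch(r"(?:[+-]?\d+(?:\.\d+)?[a-z]+)+", compact):
        raise ValueError(f"unsupported expression: {expression!r}")
    total_inches = 0.0
    for sign, number, unit in re.findall(r"([+-]?)(\d+(?:\.\d+)?)([a-z]+)", compact):
        if unit not in INCHES_PER_UNIT:
            raise ValueError(f"unknown unit: {unit!r}")
        value = float(number) * INCHES_PER_UNIT[unit]
        total_inches += -value if sign == "-" else value
    if result_unit not in INCHES_PER_UNIT:
        raise ValueError(f"unknown unit: {result_unit!r}")
    # Convert the inch total back into the unit the field is set to.
    return total_inches / INCHES_PER_UNIT[result_unit]


def height_for_ratio(width: float, ratio: str) -> float:
    """Return the height that gives `width` the aspect ratio 'W:H'."""
    ratio_w, ratio_h = (float(part) for part in ratio.split(":"))
    return width * ratio_h / ratio_w


if __name__ == "__main__":
    # The suspect from the crime show, in centimeters.
    print(round(add_lengths("5ft + 4in", "cm"), 2))  # 162.56
    # A 1920-pixel-wide image in 16:9.
    print(height_for_ratio(1920, "16:9"))  # 1080.0
```

For `10cm+3in` this sketch returns roughly 17.62 cm, which is close to the 17.619cm figure quoted above.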
Okay, this is a little embarrassing. Certain family-related stuff kept me from posting more, so there’s an almost 4 months long gap. Let’s quickly amend some of that.

Graphics

I think it’s fair to say the Inkscape v1.0 release is probably the greatest thing that happened in these FOSS parts lately. The new version comes with a whole lot of improvements, and not updating is just unthinkable to me (I did switch to the main development branch maybe half a year prior to the release). If you are wondering where this project is heading and whether they have plans to fix some of the important issues or start doing paid development (Blender, people always refer to Blender), I recently interviewed the team in a not entirely smarmy way, you can have a look here.

The GIMP team released version 2.10.20 with a bunch of new stuff as well as bug fixes. I covered some of the new features in dedicated videos:

- Basic non-destructive cropping
- Blending options for GEGL-based filters
- On-canvas controls for the Vignette filter and new filters, Focus Blur, Variable Blur, and Lens Blur (the last one is not in this video)

Tobias Ellinghaus, darktable developer and GIMP contributor, started working on libxcf, a small library for writing GIMP’s project files. The library is already used in darktable for writing XCF files, optionally with masks as regular layers (at least, for now). As such, reading XCF files is a low-priority task.

Alessandro Francesconi released BIMP 2.4 for GIMP. The only change is support for OpenEXR. Source code and a Win32 build are available.

The Krita team released the much anticipated version 4.3.0 with a ton of improvements: snapshots (essentially being able to save different stages of drawing a picture and navigate between those), Android build, improved animation tools, new and improved filters, separate opacity/lightness settings for RGBA brush tips, new Magnetic Selection tool quite similar to Photoshop’s Magnetic Lasso and GIMP’s Scissors Select tool.
There were some complications in that development cycle, all (or most) happily resolved by now. The team has several students in the Google Summer of Code program, one of them is working on storyboarding.

Albertas Vyšniauskas resumed hacking on one of my favourite applications ever, Gpick. There hasn’t been a new release yet, but it looks like there might be one later this year.

Birdfont 4.0 is out featuring better spacing classes to simplify adjusting kerning and left/right-side bearing, smaller file size for OTF fonts, and various engine improvements. Grab it here. In other font news, FreeType 2.10.2 has been released with WOFF2 support.

Photography

Admittedly, I need to look at recent changes in darktable and RawTherapee in much more detail. I did some quick testing of Aurelien Pierre’s new color mixer RGB module for darktable, and it’s an interesting if not entirely unexpected way to adjust white balance (among other use cases).

I think it’s worth mentioning that Filmulator now has a basic website at long last. Admittedly, I didn’t pay enough attention to this project until recently when I finally grabbed the AppImage to test it on some of my landscape shots. Personally, I’m not entirely at home with its UI but the processing side looks interesting enough.

Now that I mentioned Filmulator, it’s hard to avoid saying that vkdt is still up and about. If this is the first time you hear about it, vkdt is a love child of Johannes Hanika and Tobias Ellinghaus, both long-time darktable developers (Jo being project founder, in fact).

Essentially it’s an experiment of creating a darktable-like raw processing program based on directed acyclic graphs (think node compositing) and Vulkan API. Everything is rendered on the GPU, and the application has full-window color management in linear Rec. 2020.
Video

Kdenlive 20.04 was released in April with major changes: rating and color-tagging of clips in the project bin, multicam workflow support, OpenTimelineIO support, improved motion tracking and rotoscoping, and much more. The team also has a GSoC student, Sashmita Raghav, continuing their 2019 work on the subtitler.

Shotcut got two updates while I was gone. The most interesting changes are: proxy management, slideshow generator, 360 video filters, wavelet denoiser, lots of bug fixes.

There have been so many changes in Olive since my last update that I feel a little lost: how do I even approach picking up where I left off? Let’s focus on major changes:

- EXR proxies
- Waveform and Histogram scopes
- new nodes like Math and Trigonometry
- Solid and Text clips now available as nodes
- more OCIO-based color management work
- functional exporting (if you stick to H.264)
- Houdini-like ladder UI for adjusting values

Audio and music-making

Ardour 6.0 is finally out, and even a quick update (v6.2) is now available with minor improvements and new features.

The team says v6.0 was 2.5 years in the making but I would say it’s longer than that. Certain changes were already being worked on before v5.12 was released. Being one of those stubborn guys, I switched to the main development branch before the team officially said it’s a good idea to do so. Personally, I find v6 considerably stable. My only beef is that the program gets sluggish when you ripple-delete a region in a track that contains hundreds of regions.

There are quite a few interesting things happening in the background. But what I’ve learned over years is that it’s best to let Ardour devs merge all those changes into the main development branch first and then talk about all that stuff.

In late April, I talked to Paul Davis for an hour and a half over Jitsi. This later became two episodes of the Libre Arts podcast, my unintentional side-project from last year that I’m taking more seriously now.
You can either listen to the podcast or read the edited transcripts here on LGW: Part 1, Part 2.I also missed several releases of Qtractor. The most important changes, in my opinion, are MIDNAM support for LV2 plug-ins (and thus Qtractor joins Ardour) and JACK Transport latency now being taken into account for recording with latency compensation. The former means e.g. that you can see right in the pattern editor which note is responsible for which instrument in a drumkit when you load DrumGizmo LV2.There’s certain excitement around the sfizz project which is an SFZ library and LV2 plugin, originally started by Paul Ferrand. Easy to see why: Artwork Among people using Krita for painting, I can’t think of any other artist rising to fame as fast as Philipp Urlich (and rightfully so).Very nice painting of a parrot by Daroart37, made with GIMP:Lucas Falcao thinks himself he should’ve worked some more on the clothes in this render. Still, very nice work there!Some great Blender work by J.C.: Procedural animated ripples in #Eevee with PLX #b3d pic.twitter.com/Yi4yUw5aTH— J.C. (@Cuboxel) June 30, 2020
Podcast ep. 003 - Paul Davis on fixing big Linux audio issues
This is the second part of the interview with Paul Davis, lead developer of Ardour, a free/libre digital audio workstation.

In the first part, we spoke mostly about all things Ardour. But now it’s time to talk about some big stuff that is relevant to virtually every user of Linux audio applications for music-making. As with the last time, an edited transcript is available below.

Three years ago, at Linux Audio Conference, you were talking about things we got right and wrong in Linux audio. One of the things you said was that the audio system on Linux is too complicated but (and I’m probably misquoting here) it is now too late and we might have to live with what we’ve got. How much do you think we would have to blow up to rebuild Linux audio in a saner way? Do you think we are really past the point of no return?

I don’t remember the precise comment that you’re remembering. But one thing I do remember saying, and this could be it, was when I was involved with the Linux desktop architect group some years ago. There were discussions about what should we be doing about audio for the year of the Linux desktop or whatever.

And I was pointing a lot of people at PulseAudio. Because even though I was doing JACK at the time, I felt like PulseAudio was just a better choice for consumers. It just did more what they needed.

PulseAudio is now the default sound server, that’s how we do things.

I think, I suspect, that it is too late. I think, if you go back and look at what Apple did when they introduced what was then OSX but later got renamed back to macOS, they completely changed the entire audio API.

Nobody’s existing software could be recompiled.
There were no glue layers, there was no compatibility. The message to developers was just “Sorry guys, you have to reimplement this”. And Apple had enough confidence that developers who were doing that kind of software would want to do that, that they just said “This is how it is”.But Linux doesn’t work like the Apple platform. There’s no big stick. And even if Linus was of the opinion “No, now we are just going to throw out and just replace the whole audio stack completely” Even that isn’t enough to make that happen.And even if somebody did that, somebody would come along and say “Here is a compatibility library for ALSA”. Because people did that with OSS which was replaced, and this library still exists.That’s true even for PulseAudio at the wish of its original creator. Nobody was even supposed to use the PulseAudio library, you were supposed to use ALSA, because PulseAudio would pretend to be an ALSA device.So I don’t think we have a big enough stick to come along and just say “Okay, we learned a lot but we got it wrong and we are just gonna do it over”.Now, the obvious question would be, well, maybe is there some incremental way, maybe it’s not throwing it all away and maybe it’s building more incrementally?PipeWire is an interesting project. I’m very glad that its primary developer has changed his direction a little bit in the last 6 months. And I think he’s taken into account a lot of good criticism from Robin and from some other people who are still quite involved.So you kind of influenced him?Yeah.And I think that there is now a greater chance the PipeWire will be able to do what its goal is stated as. Which is to become a replacement for both PulseAudio and JACK. And if that happens, if it really is able to satisfy those goals, I think the situation is going to get a lot cleaner from some perspectives.There will only be one sound server. It will do low-latency professional audio music creation stuff. It will handle output from your browser. 
It will handle desktop beeps and notifications.

However, that still leaves the question of what APIs the software would be using. And that continues to be part of the complexity.

Now, it’s not the complexity that is unique to Linux. If you are doing audio software on Windows, and you say “Well, what audio API should I use?”, it’s arguably at least as bad, maybe worse than on Linux. It’s just a multitude of possible answers.

So the only OS that I’ve seen that has gotten it right is macOS. I would say they got nearly everything right. And the things they didn’t get right are so small that it doesn’t really matter a whole lot.

There is one API, it makes no difference whether you are trying to make a desktop beep or you’re trying to write a digital audio workstation. It just works.

And if you are a user, and you want to glue two devices to make a combined device, it’s a few mouse clicks. It’s a great design.

PipeWire might get us there as far as user experience, if it meets the technical goals that it laid out for itself. But I think it’s still going to leave this hole in the background for software developers, which is “Okay, so I want to write a piece of audio software. What do I do?”. But at least that would be for developers to resolve. And users will not have situations where, you know, “Well, I’m using Ardour, my browser won’t play audio”, stuff like that. Those sorts of problems would go away.

Now, to do it like Apple did with CoreAudio... They actually moved a lot of the kernel functionality into a user-space daemon. That raises a little bit of hope for me that maybe you don’t actually have to change the kernel and you don’t have to change ALSA. Maybe we can get this all right by having the right user-space stuff.

I think we just have to wait and see what PipeWire looks like in another year, maybe year-and-a-half.
And if it’s really getting close to meeting its technical goals, then I think at least the user experience of audio on Linux will get dramatically better.

It’s been 5 years since JACK stopped being hardwired into Ardour. Lately, I’ve been catching conversations on the IRC channel about the possibility of ditching JACK support entirely. Now, I do know that you used a bunch of smiley faces when you said that but you’ve said before you weren’t happy with how things turned out with JACK. So what’s up with that? What are the main reasons for your disappointment with JACK and what’s your long-term vision for Ardour and JACK?

So the back story here is, yeah, I was the original author of JACK which got created out of code that came from Ardour originally. I still think JACK is an incredibly cool idea, if I say so myself.

But it was created to address a very specific need which was the lack of plugin API on Linux for audio plugins. And the idea was that, instead of writing plugins people would write whole applications using whatever tools they want, whatever GUI toolkit, language... And then we’ll just use JACK to connect them together. And this is going to be great. It means that the developers won’t have to coordinate as much as they do with plugins and digital audio workstations. The workstation developers won’t constantly be having bug reports from people like “I use this plug-in and it crashes”. And the plug-in developers won’t be getting emails like “I can’t use your plug-in in this new DAW”. It would be great!

And JACK does that, and that’s awesome because the idea of sending audio and MIDI from one program to another just as easily as sending it from the program to the hardware or computer... I mean, this ought to be easy to do. And JACK does make it kind of easy to do.

But then we get into both the philosophical and just code development environment problems.

One problem with JACK is that it means that you have what some people would call a modular setup.
You got a bunch of distinct programs talking to each other. You don’t have any really good way to save the state of all the programs at once. We had attempts and they are not bad. But it’s still nowhere near as convenient as using a single program that has everything happening inside of it. And that creates an environment for a lot of users that, I think, is much more complicated than what they really want. Even the basic idea of JACK for a lot of new users, when they have to deal with that as the first thing they have to deal with... It’s just not an obvious concept to them. They are used to a computer where you click on the button, probably in a browser, that has a little arrow pointing to the right and a circle around it, and it suddenly starts making sound out of the speakers and that’s all that has to happen. And then you say “Oh, no-no-no, there’s this cool tool, and it has a patchbay, and you have to set parameters, and all this other stuff”. Now, when you get deeply into it, when you are really involved in pro audio workflows and stuff, you start to understand, like, “Oh man, this JACK thing is actually really cool for some things”. But it’s really not something you want to put in the path of a person who’s just starting out using this kind of software. So that’s one problem with it. Then there are some design problems, things that we never really did. Some things have been addressed by the metadata API that came out some years ago. It lets you, for example, put pretty names onto ports in JACK so that you know what they are. But we never really grappled with the issue of, you know, when you have a piece of hardware there, the hardware is normally identifiable by at least a name on Windows and macOS. And sometimes the channels have nice names. JACK works almost all the time to not do that. “I don’t even want you to know what sound card you are using, it’s just called ‘system’. And the ports are just 1, 2, 3, 4, 5, 6, 7, 8, 9”.
So that’s not there. There are other issues, e.g. we never addressed looping in the JACK transport mechanism. Also, there’s no latency in the wire in JACK, which means JACK can report latency but it cannot do anything about it. So these are the technical problems that got in the way. And then there are, sort of, community development issues. Early on, Stéphane Letz started working on the multi-processing version of JACK, which I thought at the time was a great idea. It was written in C++, it was going to do parallel processing etc. And it seemed entirely appropriate just to go down that route. There’s no point rehashing old history, but Stéphane’s efforts, which became JACK2, never really merged, they never united with JACK1. They are compatible at the binary level but they have different capabilities for users. It’s sort of a mess. I handed control of the JACK project over to Filipe [Coelho] a few years ago. I’m successfully completely ignorant [laughs] about what he is doing and what has been done. And I’m happy to keep it that way. And so JACK will do what Filipe and the rest of the project want to do with it. You asked about Ardour and JACK... Yeah, there were smiley faces. I don’t think we’re ever going to drop JACK support. We do continue to run into problems, most recently what I consider to be the crazy decision about threading, where an application is actually processing audio and, in the middle of it, can be told things like “Oh, one of your ports just got disconnected” or “There was a latency change”. I didn’t agree with Stéphane about this at the time and I disagree with it even more now. But yeah, we’ll figure out workarounds for some of that behavior. And I don’t think the support for JACK is going anywhere. But we do now generally tell new users “You don’t have to use JACK. And in fact, if you don’t use JACK, your initial experience is going to be a lot easier”. That’s particularly true for MIDI devices.
Most people using JACK2 have to go through some extra loops to actually get hardware to show up. Whereas if they use the ALSA back-end on Linux, it just works. So JACK will be there, but we will suggest and make it more and more obvious that JACK is not the obvious thing for you to use. There’s been a lot of talk about session exchange between Ardour and other digital audio workstations. I know you’ve been extremely unhappy with OMF and AAF. And I know you’ve been so unhappy that you even promoted AATranslator, which is proprietary software for project file conversion, mostly because it took that weight off your shoulders, I guess. But I recently heard Robin and you, I think, commenting positively about OpenTimelineIO (OTIO), although that project is more commonly associated with video production and mostly has adapters for software like Final Cut Pro X. Do you think OTIO is the way forward? I think at this point it is still a difficult question to answer. I only had a brief look at the API and the specification for OTIO. It’s hard to overstate how much of an improvement it is over OMF or AAF. It is a vast improvement over what was going on with both of those “standards”. I mean, OMF is barely even a standard: you can’t even really get the specification for it. I will also note that both of those two, OMF and AAF, are also primarily associated with video. I think that the critical issue with these formats from my perspective is, are they sane? AAF is not sane, AAF is insane. AAF is an example of what happens when you have not one but three committees. When you have ideas like “Ah, well, we don’t want to specify what the file system is, how files are represented, so we will use some Microsoft thing that lets you put a file system inside a file and that will be our way that we talk about files”. It’s just insane. There are good signs that OTIO is a huge step up from that mess.
So some of the worst elements of even thinking about doing AAF support go away.We are also living in an era where generally the open source contributor based mentality promises that, one way or another, that library and that standard will just get better.Again, I haven’t looked at it in enough detail, but the other part is how well they represent the audio aspects of what you need as a session format.If you only represent the totally straightforward non-linear editing bits to match audio like “This file asset source something, we need this bit of it and it goes at this point on the timeline”, well, it’s okay, at least you could do that for audio. But actually, to make it useful as an audio session format, you need a lot more information beyond that.I don’t know how far down that path they’ve really gone. There’s an open question about whether or not a session format should even attempt to try to represent plugins. It gets quite complicated because, you know, if you got AudioUnit plug-ins, they don’t run anywhere but macOS.I am not familiar enough with those audio-specific elements of the representation in the file format specification to really know how good of a future it looks.But if they can get those at least adequate, then it offers some hope that audio applications like Ardour could start maybe using and supporting that format.Which with AAF is never gonna happen, and even with OMF I don’t know why but Reaper’s support for OMF was opensourced. So I actually did a [git] branch a few years ago, added that to Ardour. And it turned out that, you know, it supports maybe 1/3 of all OMF files. It’s just another indication how appalling the specification and standardization is.So I’m cautiously optimistic. I think the other thing that’s interesting though is I’ve heard a refrain from a number of people that said that actually this whole notion of being able to move sessions between tools just isn’t really a thing for most users. 
Especially for whatever number of recording studios still exist in the world after this virus. You don’t want to make it easy to just take the session that you created in this recording studio and have them take it home and use a totally different tool to work on it. So, like, do the stem exporting and be glad of it? Yeah, be glad of it. But also this story of migration just doesn’t happen very much. The one area where I think it is more relevant is the video editing to DAW and back and forth. So basically in movie post-production or even production stuff... That does seem like a much more legitimate case to me, where there’s one person working on the video side and they use this tool, and then we have this guy working on audio with another toolset. And it seems right natural in that context to go back and forth really easily. And so that is the one use case for this kind of support that really makes sense to me. The bit that doesn’t make sense is “What? I use Ardour but my friend uses Cubase, I want to be able to go back and forth with him”. It’s like... No. No! [laughs] Actually, speaking of plugins... Another thing you said in that LAC talk was something along the lines of, the fact that U-he were able to develop the Zebra synthesizer so fast, and we only have this many good virtual instruments, means we are doing something wrong. But there is now an even more staggering example. The VCV Library now has 2,000+ modules. VCV Rack is also free/libre software, and it has only been around for three years. LV2 has been around for well over a decade, and the last time I checked there were about 1,000 related repositories on GitHub, most of the plugins not easily installable, you have to compile some of them. What lessons do you think we could learn from VCV and how can we realistically improve the LV2 ecosystem? Well, let me first say that I have just an incredible amount of respect for what Andrew [Belt] has done with VCV Rack.
As an audio software program, it has some technical issues inside of it that bother me. But as a user of it, which I am, it’s an amazing piece of software. But even more amazing than the software is the ecosystem he has built around it. Which you just mentioned. It has over 2,000 modules. The ease of getting these modules, of using them in the program, is just completely exemplary. As well as the way in which the project has drawn in all those people who are actually involved with hardware modular synthesis and even other software modular synthesis environments, and even people who are just completely new to the whole thing. It’s just incredible and I have the utmost respect for what he has done. And it makes me a little jealous... In a good way? [laughs] Yeah, in a totally good way! I’m also very happy as a user of the program because there are just all those incredible modules out there to do really great things with. Now that being said, there is one element of what Andrew has done with VCV Rack that makes facilitating modules a lot simpler. Which is, when you run a module for VCV Rack, you are literally running a module for VCV Rack. It is a plugin in many, many ways, by comparison with other sorts of VST plugins or AU plugins. But it is actually a plugin for a specific program, not a whole suite of programs. And so in one simple move you’ve just wiped out a huge number of the problems that plugin developers have to face, like, “Which host am I targeting?”. In fact, Andrew hid a lot of platform-specific stuff too. Of course, someone still has to build and compile plugins. So developers who want to honor the three-platform tradition of VCV Rack (Linux, Windows, and macOS) still have to grapple with the fact that they have to build them for people to use.
But when you write them, there is almost no platform-specific stuff.So he’s done something that’s mildly equivalent to what is increasingly happening in the web browsers now they are entire development environments.VCV Rack is this completely closed system, and I mean it in a good way, not in contrast to libre/open aspects of it. It is just what it is, and when you write a module for VCV Rack, that is all you are doing.You don’t have to think which graphic toolkit you should use, what are the subtle details of how you do critical sections on Windows vs macOS. All that goes away. You just have a pretty simple SDK to work with, a brilliant design of how the module graphics should work.And that gets rid of a lot of the complexities of doing actual traditional plugins. I think that’s one of the reasons why it’s been so successful. I think what Andrew did with the environment and the ecosystem has made that really easy to do. And I think that it’s not really very easy to do that if your target is not one host program or platform.Also, no audio plugin API completely specifies the graphics part. And so every developer has to deal with this question: “How am I gonna do the graphics”? And on Linux, that’s even more of an issue than it would be somewhere else.So to me, one of the lessons is that if you build a much more constrained environment, but it’s still very powerful, lots of people will be interested and lots of things will happen.Now, it doesn’t happen all the time. Reaktor, for example, normally should have been just as successful. As far as I can tell it just isn’t. Lots of people even like Reaktor and use it. But Rack has just exploded, it’s a meme, it’s part of what people are doing now.But the other thing is that it’s open, it’s libre, and it’s cross-platform. And I think that also drives interest because any developer who starts thinking “Maybe I’d like to do a module”, it makes no difference what platform their users are on. 
Users don’t have to care. They [developers] might. Although maybe Andrew has taken care of that in the website back-end, maybe all you need to do is to upload the source code, and it will build the modules for them. So: constraining the environment but making it super powerful and cool, and unconstraining the project, and making it so that people who want to get involved in that are not sacrificing anything by making a decision. And guess what: people will just flock to it. I’m not sure that we can really take that model and use it, for example, for plugging into your digital audio workstation, unless you’re talking about a super simple one. I think you could argue that a tool like LMMS in the Linux world... I mean, LMMS is a much simpler kind of environment than Ardour, Logic, or ProTools. And I think you could make a case that you could come up with a design for a very constrained digital audio workstation that would have the same kind of appeal, maybe it would have its own plugin format, it was cross-platform, all graphics got taken care of the same way as modules are in Rack. But that would be a very constrained tool that would be of no interest to anyone. Also, for users, there are tools like that out there. But VCV Rack was new, it wasn’t just done in all the right ways, but it was something that didn’t really exist until it came along. It does share some similarities with pre-existing software.
But it just did so many things so right that it opened up new ground for people, that’s why it is so successful. With something like LV2, which is another plugin API in a sea of plugin APIs, one that doesn’t really solve many if any of the problems that plugin developers are gonna face, I think we just can’t have that kind of success in the world. What I think is encouraging is the fact that there are now several plugin development toolkits that make it almost as easy if not as easy for developers to do Linux VSTs, for example, as AudioUnits or Windows VSTs, and maybe generate LV2 plugins as well. And that type of thing does open up the possibility for a developer to generate a plugin for everyone. And maybe that levels the playing field a little bit more. Maybe there are more people who want to do really cool stuff, and the end result will be that their work is available for everyone. Not necessarily without cost, not necessarily open source, but available. So cross-platform plugin development toolkits are more analogous to what VCV Rack pulled off than the plugin APIs themselves. But yeah, Rack is just... I have to avoid using it because I could just waste most of my life [laughs]. You know, actually, sometimes it feels like Ardour is a testbed for experimental LV2 extensions. It was first to use the mixer inline extension. Then there were extensions for matching dpi and colors between host and plugins, if I’m not making this up :). And I don’t believe it’s going to stop at this. How comfortable are you with it? Oh, I’m pretty comfortable with it. I think this is part of the whole point of LV2. And what has happened with that in Ardour is, I think, a really good demonstration of the power of the design of LV2. Which is that all it takes is an agreement between one host and one plug-in or one plugin developer. And all of a sudden, you can add cool new functionality without actually having to change the API or meet with some standards body or anything else.
There’s no equivalent of that whatsoever for VST and AU, you can’t do that.And as we work in the context of a digital audio workstation, most of the proprietary DAWs have their own internal plugin API. And they come with their own proprietary plugins that come with the workstation, and so they can do whatever they want there. They don’t have to publish it, they don’t have to talk about it. They can just create whatever mechanisms they need for the plugin and the host to communicate and do whatever they want.With Ardour, we sort of use LV2 that way but we are not corrupting LV2. We are using it in a way that was intended.Another example of what you mentioned, one of the things that we’ve worked on is the plugin-provided MIDNAM data. So MIDNAM is this format where you can describe the names of MIDI notes, the names of patches and programs, and generally annotate a MIDI-controlled synthesizer.But MIDNAM files are normally just files. So we need a mechanism for the plugin to tell a host: “Here is my MIDNAM”. So we now have an extension for that, and plugins can provide a MIDNAM. We provide it in the general MIDI synth that we include, and there are at least a couple of other LV2 plugins that now provide MIDNAM files too, I think Helm is one of them.Doing this in the context of any other plugin API is possible. It is something that Reaper has done with VST, for example. They have at least half a dozen extensions to the VST API. These are not part of the official standard and not fully documented. So it’s just Reaper waving a hand and saying “Hey, we do this!”.Whereas with LV2, you can actually say: “Here is the spec for this extension. And it’s written down and this is how it works”. So I think this is a great thing and I don’t think it’ll go away.Do you think it helps that David Robillard [main developer of LV2] works for Ableton now?As far as I know, David keeps his work life and his open source life fairly separate. 
So I don’t think his work inside of Ableton really helps the LV2 situation very much. But I think David is a great asset to what LV2 is and can be. And I think the fact that he works in the context where he gets to see other aspects of how this whole audio software world works probably can’t hurt. There have been several attempts to build commercial Ardour offspring. As far as I can tell, Mixbus is the only one still alive and doing very well after over 10 years in business. Apart from contributing code to Ardour directly, they also employ Robin Gareus, whose level of involvement with Ardour development, I dare say, rivals yours. Do you think they’ve been successful because they are so heavily invested in improving the upstream project? Or are there other reasons at play? As you said, there have been a few attempts at doing some commercial spin-off from Ardour. Mixbus was the first. The second one, which never really got off the ground, was with Solid State Logic. And the third one was working with Waves Audio on the Tracks Live project. What I can say by comparing all these three is that there are some important things about Harrison’s approach to this that helped along the way. First of all, I assume it was Ben Loftis at Harrison who understood and had a very clear picture in mind early on of how they could do something with a GPL piece of software and make a product that would work for them and be viable. Maybe Ben didn’t really know at the beginning, but it might be. And I think, in contrast, Solid State Logic really struggled with this idea: “How could we ever do anything with this if it’s open source? What can we do? This doesn’t make any sense!”. There were other reasons why SSL dropped their involvement with Ardour. You know, giving them the benefit of the doubt, relatively understandable and not really a verdict on Ardour itself. But they really struggled, they couldn’t even really come up with what the starting step should be.
Because they couldn’t imagine what the end was gonna be. Whereas Ben and Harrison in general could do that. Behind the scenes, Waves have been very strong supporters of what I’ve been doing for the last 15 years. And I really like some of the people who work there a great deal. Waves Tracks Live did become a product and they did release it and it did get good reviews. It was a part of a strategy for them to get into their own digital audio workstation. But it was decided that instead of jumping all the way into a studio one it would be better to start with a more constrained thing. Tracks Live was intended for use in live situations. And as an aside here, there was a really great experience for me. Before I left Philadelphia, I got to see Nils Frahm, a German contemporary piano musician whose music I like a lot. And as I was standing up on the balcony level, I looked down and realized that his audio engineer was using Tracks Live to record the whole thing. I went up to him afterwards and asked him how it works. He said: “Oh, it’s great! It does just what I need! I love this software!”. And I said: “Well, I wrote that”. And it’s a whole different experience when it’s happening at a concert of somebody whose music you really like. That was cool. But anyway, it was a somewhat successful product that is still available. I think it doesn’t even cost anything anymore. I don’t think they charge for it. But Waves also had this problem. They didn’t really quite know where they wanted to go. They weren’t even so much as appalled by the GPL. Their reason why they wanted to do this wasn’t “Oh, we are gonna make tons of money by having a new DAW”. Their goal was “We wanna do certain things, we are a plugin company right now for the most part. So we want to get to a place where we can try cool ideas and some conceptions we have for some new technologies. We can’t do that unless we control a host.
So we need our own host, a workstation”. Yeah, it looked a little like they needed a DAW for their own MADI interface. Yes, but there are also some ideas that they have... You know that I signed an NDA form and can’t speak freely, so... Some ideas to do with mixing and how to get really great mixes on a song and so on. But if you don’t control the whole DAW, it’s difficult to do. Waves wanted to go down that path because they had some ideas for things that they felt they needed in their own workstation. And they had already been involved with me in various ways over a number of years, and they said: “We like the idea of Ardour, let’s start from there”. But Waves had a different problem: the developers that they hired didn’t have any experience with cross-platform desktop GUIs, although their entire plugin team uses Qt. They were used to using rapid application design tools where you essentially mock up the entire GUI, and it spits out a bunch of code, and nobody ever sits down and says “Oh, put this button here” etc. And this approach has a couple of benefits. One of them is that you can have a designer working on the GUI, and they don’t have to write code, they hopefully know about good design. I’m sure Martin aka “Tantacrul” would disagree with this! [laughs] But anyway, with this approach, you don’t have to frequently rebuild the software when the designer decides to change the arrangement of things. The other benefit is that you completely separate the design from the application’s logic. So the programmers they hired were used to this approach. When they started looking at Ardour, where everything is done in code... I wouldn’t say it blew their minds, but it’s just a concept they couldn’t really deal with.
And because they wanted to really change some really significant elements of the user interface... Which they did. Which they did... This was a really big culture clash and it made it very difficult to integrate what they were doing with Ardour’s source code. And although Tracks Live was successful and although there are some things that came out of that work that are in Ardour now, we were never able to really connect and stay connected. With regards to Harrison, that is exactly what has happened. Harrison’s commitment to the GPL, to “we are not gonna redo the whole GUI”... Ben probably pulled his hair out multiple times, “GTK! I can’t believe this!” [laughs], but, you know, they’ve been willing right from the beginning to just say “No, it is what it is”. They understood that switching the toolkit was not going to solve any problems. And they’ve been willing to be deeply integrated into the Ardour process as well as the Mixbus process. And so the two projects have been able to evolve side by side. What used to happen with Tracks is that they would give me the work they had been doing for a month, and then I would have to spend two days on my git merge. It was just a complete mess. Robin does the merges between the two [Ardour and Mixbus] and I have never really asked him in detail how it is. But I get the impression that 90% of the time it’s completely easy. And in fact I know they’ve even put more effort into trying to make sure it is easier instead of more difficult. So I think, with Harrison, they had a vision for how they would be able to do something interesting and cool and useful and hopefully helping them to pay for what they do. They had that from the beginning. They have been willing to deal with the technology that is in Ardour as-is, particularly all the GUI stuff. And they’ve been happy and able and willing and really valuable to be very integrated into the Ardour development process. That’s why I think those three things made a huge difference.
Even though Tracks Live was, when it initially came out, probably more successful than whatever Mixbus version was then. So it’s not like you can’t have some momentary success, but I think part of the reason why Mixbus has that long-term success is those three things. Combined with the fact that it’s a pretty cool idea and for a lot of people it sounds really good. You know, they just put their sessions into the Mixbus and it sounds more like they think it ought to sound. So these four reasons combined are part of the reason why they’ve been really successful and the other attempts to do this are not. I’m not saying no one else can do that as well in the future, but I do feel very strongly now that somebody has to have those first three conditions or it’s not gonna work. They got to understand how they are going to work on something available under the terms of the GPL. They got to be prepared to accept the GUI technology for what it is. And they should be willing to be fairly integrated with what we’re doing. Because otherwise things just diverge and they can never be reconnected. If there are companies out there that have ideas and they think they can work within those constraints, then I’m all for more collaborations or even just forks. But without those three I think it’s probably not gonna work elsewhere. All interviews take time. There's always the research stage, conversations on and off the record, editing (especially in podcasts), etc. The thing that really consumed a lot of my time here was preparing the transcript for people who would rather read text. So if you enjoy the work I do, you can support me on Patreon or Liberapay, or make a one-time donation.
Podcast ep. 002 - Paul Davis on the deep rewrite of Ardour
This is the first part of a long interview with Paul Davis, founder and lead developer of Ardour, a free/libre digital audio workstation. Hello Paul! Hi Alexandre, how are you? I’m fine. How are you doing? Great, thank you! So you recently moved to New Mexico, is what I heard... Yeah. It’s very different from where I used to live. But I’m enjoying it a lot so far. Unfortunately the mess you can see behind me is an indication I’ve only been here a year and unpacking all this stuff hasn’t been a very high priority (laughs). Do you get out much these days? Not as much as I was a few months ago. I mean, we live in a remote isolated place, so the virus is not really having a lot of impact here. But I’m not really running or cycling as much as I was. On the other hand, we have a fairly large property here, and I’m building solar panels and doing gardening, and so I’m spending a lot of time outside. So you are going all green? :) Yeah. I’m hoping the panels I put in mean that we will be net zero for electricity next year. Not sure that will work out but that’s the goal. Great, so I think I’m probably going to do everything wrong and start with a question that is unrelated to the new release. A few months ago, there was a very insightful thread in Ardour’s forum about extensibility of digital audio workstations. If I was foolish enough to try summing it up in a few words, I guess that would be your concern, whether making Ardour’s source code open and free to modify and distribute really benefits the vast majority of users in a way they can appreciate.
The example that you used was Reaper, a non-libre commercial DAW that is so extensible that people can make very clever hacks without ever having to touch the core source code. Which they don’t have access to anyway! Now that you’ve had some feedback, what would you say is your takeaway from this discussion? I think the response was quite interesting. There were some good points made by people coming from different perspectives with different answers to the questions I was asking. I don’t think the discussion really changed my mind a lot or moved me closer to having a definite answer though. I think your summary is concise and correct. I think, you know, the heart of it really came from a conversation with someone who’s very familiar with Reaper, who wanted to do various things with Ardour. People can get mildly upset that you have to do it with the source code rather than just having some script. The conversation sort of pointed out the merits of both sides. The open-source side is really good, and if you can have incredibly extensive, powerful scripting capabilities, that’s really good too. And if you can have both of them available in the same project, then you just got the best of all possible worlds. Unfortunately, I think that, although I have never spoken to anyone who’s involved in Reaper about this, I suspect that their decision to have the scripting be really deep and really powerful was something that they decided to do really quite early on in the program’s history. I don’t know if it was in the very beginning because I’m not even sure Lua was an option when they began. But I think relatively early they made the decision to do that, and I’m not so certain that retrofitting the level of scripting that they have into a project that didn’t make that decision early... I think that might be very difficult. So what the future holds for that question, I don’t really know right now.
I know I would like to see Ardour having even more powerful scripting than we are already doing. And what we have is pretty powerful. But exactly how we would do that and what that really means, I’m not sure. But would you stop at continuing to integrate Lua? What I see is that Robin Gareus continues expanding the coverage of API for Lua. So would you stop at Lua or maybe you would try to introduce support for FAUST scripts, for example? Because there was another interesting discussion at cdm.link, if you remember, with Artemio Pavlov of Sinevibes. And you made a very interesting point that while Korg’s SDK is kind of fun to use and easy to deploy, it’s not as powerful as FAUST. So do you see any future for FAUST in Ardour? I haven’t really thought too much about direct integration of FAUST. I tend to think of FAUST as a language to write DSP in. So in my conception of the role that FAUST plays, it’s more to do with whether people can write plugins easily using FAUST, and then whether we can load and run those plug-ins. The reason I say that is because Ardour itself doesn’t really do much DSP. We don’t have built-in source code that does anything like that really. The most expensive DSP operation that we do is the fader (laughs). That’s the one part of Ardour that is handwritten in Assembler because it’s actually really expensive... well, sorry, that’s not the fader, it’s metering that is written in Assembler. But Ardour itself doesn’t really do much DSP, and so the context in which most people would add DSP would be a plug-in. Now, Ardour has made it possible to write a plug-in in Lua, and I think that provides some justification for maybe saying “Well, I’d like to be able to do that, but with FAUST”. Since it can be done in such a relatively contained way, especially ’cause FAUST has a just-in-time compiler, I think we just come down to whether or not somebody steps up and says “Hey, I’m going to implement this”.
And if they implement it, I can’t see a reason why we would say “No, we won’t merge that”.
On the other hand, again, I’m not really sure quite what types of DSP somebody would want to do in that way. Whereas I can imagine lots of reasons why somebody would wanna write an actual plug-in using FAUST. And if they do that, that doesn’t really need us to know anything about FAUST.
Okay, so you have obviously spent a huge amount of time rewriting how Ardour implements some very basic concepts like time. So I guess my question is, how happy are you with Ardour’s architecture? Is there something you think you still really need to rewrite? Or are we basically good to go for another decade or so?
I think the answer to this is complicated. I think that the changes that Robin and I have made over the last two and a half years have addressed a lot of basic issues that have been there pretty much since the beginning of the program’s life.
We now have complete latency compensation inside the program, regardless of how you route a signal through Ardour: with auxes, track to track, track to bus, sends... Everything will always be latency-compensated and properly aligned.
I know it sounds like something that you can just add on top of what was already there. But to do that properly really involves completely rethinking what was happening while we were processing audio. And we’ve done that, and I think the basis that we have for that is hopefully good for at least a decade.
There are other changes that we’ve made. I think some of the work that I did, well, I guess, it was really two separate features. One was cue monitoring, which is being able to listen to both what you’re inputting and what’s on disk at the same time.
And the other feature being wet recording, where you can actually record the processed input rather than just the raw input.
The work I did on those also involved some pretty deep changes in how things work.
Specifically, the signal flow, you know, going through a track or through a bus had to be revisited, and we’ve done that, and that put us in a good place.
So most of the things we worked on as part of the build-up to 6.0, I think, put us in a good place for the next decade.
But there are things that we didn’t work on as part of the build-up to 6.0. And those things I still have concerns about.
What things?
Well, the two most significant ones... One of them was something we hoped to include in 6.0. We decided that we didn’t even really fully agree on what the correct solution was. And so we decided to drop that. It’s how to represent and manipulate musical time.
I spent a long time working on a development branch, and it was very complex. There was a lot to think about, there is still a lot to think about. This is also connected with ideas of what you do with the tempo map.
Oh, was that the nutempo branch?
Yes, that’s the nutempo branch. And the main goals of it were... Whenever you do operations in a particular time domain (and by time domain I mean: are you talking about musical time like bars and beats, or audio time like samples?), whenever you do an operation like “I want to move a region a certain amount later on the timeline”, if you specify that in, for example, bars and beats, then it moves exactly that number of bars and beats. Not some approximation, but exactly that number of bars and beats. If you move it a certain number of samples, then it moves exactly that number of samples. And so we keep the time domain that is relevant. We stay in that time domain as much as possible.
That work did get to a fairly advanced stage, it was all done, the program would run. There were certain bugs that got fixed because of it.
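As an aside, the “stay in the relevant time domain” idea can be pictured with a small sketch. This is not Ardour’s code: the `ConstantTempo` class, the helper functions, the single fixed tempo and the 48 kHz sample rate are all assumptions made up for illustration. It only shows why a move specified in beats is exact when applied in beats, and drifts when it is first converted to whole samples and back.

```python
from fractions import Fraction

SAMPLE_RATE = 48_000  # assumed sample rate in Hz


class ConstantTempo:
    """Hypothetical, simplified tempo map: one fixed tempo, no meter changes."""

    def __init__(self, bpm):
        self.bpm = Fraction(bpm)

    def beats_to_samples(self, beats):
        # Audio positions are whole samples, so rounding happens here.
        return round(beats * 60 * SAMPLE_RATE / self.bpm)

    def samples_to_beats(self, samples):
        return Fraction(samples) * self.bpm / (60 * SAMPLE_RATE)


def move_in_beats(position_beats, delta_beats):
    """Musical-time move: stays in beats, so it is exact."""
    return position_beats + delta_beats


def move_via_samples(position_beats, delta_beats, tempo):
    """Lossy alternative: convert to samples, move, convert back to beats."""
    pos = tempo.beats_to_samples(position_beats)
    delta = tempo.beats_to_samples(delta_beats)
    return tempo.samples_to_beats(pos + delta)


tempo = ConstantTempo(bpm=93)
step = Fraction(1, 3)   # move by one triplet eighth each time
exact = Fraction(1, 3)  # starting position, in beats
lossy = Fraction(1, 3)

for _ in range(3):
    exact = move_in_beats(exact, step)
    lossy = move_via_samples(lossy, step, tempo)

print(exact)           # 4/3
print(lossy == exact)  # False: rounding to samples changed the musical position
```

The same reasoning applies in the other direction: a move given in samples is a plain integer addition and stays exact as long as it is never routed through beats.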
But there were a number of other aspects to it that Robin and I didn’t ever fully agree on. We will, it’s not a big problem. We deferred that thing for the time being.
The other part of that work that is also really complex is the tempo map. Because part of the goal of all this was to allow a much more powerful tempo map, including what are known as (I believe I got this right) rubato sections, where you’ve got essentially no time. You are playing and all of a sudden you can play in whatever time you like.
So a lot of that work has been done. A lot of the conceptual work has been done, a lot of coding work has been done. But it’s not in 6.0.
I think that until we get that stuff into the code base, it’s going to have problems. So that’s one thing where I think there’s still a significant amount of work to be done, although I hope I’ve done most of it.
The other area that I think we just don’t know about, because we haven’t really tried to do it very much, is what I sometimes call a groove-centric or beat-centric sort of workflow. This is basically the Ableton Live, FL Studio model of doing things.
And because we haven’t really tried to do that yet, I’m not completely confident that all of the data structures and the APIs and everything that we have inside of the program are where they need to be to fully support that work. I’m saying that partly based on a little project that I started and probably won’t finish.
A couple of months ago, I came across this really cool piece of hardware called noodler (NDLR). And it’s just a little box to generate MIDI. I think it generates four outputs: there’s a drone, then there’s a chord output, and then there are two sort of melodic outputs. And it’s a really cool little box that you use to generate backgrounds that you might then play over just to mess around with ideas and see what happens.
Some ambienty stuff?
The examples I’ve heard used the most are sort of ambient, yes.
You sort of get big pads and then some sort of subtle maybe bass groove or some ostinato or arpeggio type of thing going on.
And, you know, being a programmer, I thought: well, I could spend three hundred bucks on the box or I could just write the software. Wouldn’t it be much more cost-effective to just do that?
So I started working on a version of this, and part of the reason for doing it was that it was small and standalone and it was going to let me just play around with some of the issues that we have, if and when we move towards this beat-centric stuff.
And as I started playing with how I was going to do that in this little program, initially I thought, this is going to be great, I’ll do all this and then we can use it in Ardour. Then I more and more found myself thinking: “Hmmm, I’m not even really sure I know the best way of doing this in a little program, let alone the big one”.
So that also raises questions in my mind: as we move towards doing that kind of thing, I’m not quite sure that we’ve got the architecture right for it.
But I think that would be an additive process when we do it. I don’t think it’s going to involve tearing up all the stuff we have. I think we’re going to need to add some API and some structures and stuff like that.
So I think, in general, we’re good hopefully for a decade, perhaps even more.
But there are going to be some additions that will come, I hope, reasonably soon to deal with those areas.
I have a little fun story to tell for my next question. So there is this new project, Olive, a non-linear video editor. And the guy who started writing it understood early on that he had made some bad internal design decisions. So he started rewriting everything using the right architecture, reusing existing components like libltc, OpenTimelineIO, OpenColorIO and so on. Making his software more of a glue between all the right libraries.
And at some point, he disabled the bug tracker so that people wouldn’t bother him with feature requests. But one guy found a loophole. He basically created a pull request containing changes between two branches and used the comment section to ask for stuff. So what’s your experience? How much pressure did you get from users during the time of the rewrite? Especially since it’s been two and a half years since the release of version 5.12.
I would say that users have been incredibly understanding. I would say I felt very little pressure. I can’t speak for Robin, but I suspect he feels the same way, certainly as far as Ardour users are concerned. I think people generally seem to have understood what we are doing and just stayed quiet.
People would show up regularly on IRC, the chat channel, to ask what the state of things was. But in terms of people showing up and saying “Are you going to do this and will you please do that”, that didn’t really happen. In fact, what I have noticed is it’s probably been about a month since... We really should have started saying “Hey, we’re getting very close to 6.0. File bugs, we want to hear about them”.
And that has resulted in a lot of new bug filings and feature requests. And in most senses this is all good. Particularly the bug reports are invaluable because we don’t want to release this with huge things that we just missed.
The feature requests coming up now are a little difficult and challenging. If people were right in front of me, I’d be, like, “People! We’re trying to release this software! We are trying to tidy up loose ends to get this out. I do not want to talk about your incredible cool new idea for something”.
However, that being said, it doesn’t really feel much like pressure.
Part of the point of putting things in a bug tracker is that they stick around and they persist, and we can come back to them in a month or two months, three months, or next year or whatever.
So I would say overall that the user community has been really great and very understanding. In fact, I’m really amazed there haven’t been more threads and more comments and people saying, like, “What the hell has happened to this project? It’s been a year, it’s been two years, it’s been 2.5 years since the release”.
People seem to have understood what has been going on and made it possible for us to do that work without feeling unduly pressured by it.
So that’s been really great.
I also noticed that you bumped the monthly limit of donations one or two times during that period, and every time you got 100% coverage for your expenses. Which also goes to show.
Yeah, the financial side of Ardour right now, especially during this virus period... I know we are not the only project or piece of software that is experiencing a rise.
Let’s just say there’s a lot of people at home right now who have decided to try to make music.
Whether this is a new level or whether it will drop off, as hopefully does the pandemic situation, I don’t know.
But the financial side of Ardour... I try not to be too proud of it. I try not to be proud of most things really, but it has been really remarkably successful.
It’s not successful in the way that Reaper is successful, or in the way that Ableton Live is successful, and it doesn’t generate anything like their revenue.
But in the free software context, we bring in more than a hundred thousand dollars a year without doing most of the stuff that companies are supposed to do in order to do that.
And I’m incredibly grateful to all the people, both the people who pay one time and the subscribers, who just make it possible for both myself and, to a limited extent, for Robin as well to carry on viewing this as an actual job and as a thing that we do full-time and not squeeze in around the edges. [Editor’s note: for clarification, Robin works full time on Ardour with additional funding from Harrison Consoles.]
So that’s really great. At the moment, if the virus situation carries on, then we face a more difficult situation. I mean, it’s a good problem, which is whether or not we actually try to hire someone in the future, probably not full-time. One of the things the program is really lacking right now, and maybe the biggest obstacle to many people using Ardour properly, is that we do not have enough good documentation and enough video tutorials.
So one of the things I’m thinking about with the uptick in revenue that happened with the virus is maybe trying to divert or use some of that to try to help address that situation. I think the problem is, there is so much functionality that people just don’t know about. So many things that you can do, and people have no idea it is in there. I just think it would be a benefit to all users, both the new ones and the long-standing ones, if there was just better documentation and better tutorials.
That sounds like a really good plan. Okay, let’s talk about GTK. You’ve been steadily moving away from it for the past several years. Ardour now only uses it for packing widgets, for the file dialog, and for text input. And if I remember correctly, you were going to switch to using a constraint-based layout manager like emeus in the future. Why was it necessary to start replacing GTK+ with your custom code?
Well, it’s a very long story, 20 years long in fact.
There were a number of things that sort of got the ball rolling. I suspect that the most important one was the issue of the canvas, which is the thing we used to draw the editor, where you have tracks etc.
GTK never provided anything suitable for doing that. So even in the earliest versions, we had to use a separate canvas object in order to do that kind of stuff. And we used to use something called gnome-canvas. It was one of the 5 or 6 different canvas libraries that existed for GTK. And as we used that for a while, it became clear that it wasn’t really quite what we needed. And so a guy who was very involved in Ardour for several years, Carl Hetherington, took on the task of writing our own canvas object that would be tailored exactly to what we needed in the context of a DAW. So that was the first big break, in the sense that you now have a situation where, depending on how you use Ardour, 80% of what you’re looking at is not GTK. It’s a canvas object with things going on on the canvas.
The second part of it is sort of a combination of... There’s two ways of talking about it really. GTK is a desktop graphical user interface toolkit, and so it features buttons and text entries and dialogs and a bunch of other things.
The problem is that an awful lot of the things it provides are just not really that useful in the context of creative software. And worst of all, they don’t actually work in the way that you’d like them to.
I think the simplest example I can give is this, and it is a little bit complicated to explain, but... There’s an idea in software engineering called Model-View-Controller (MVC) programming. When you are talking about a GUI, you have a button on the screen, and you click the button. And when you click the button, the user is making a request to change the state of something, like mute this track or solo this track or turn this on, turn this off. And that’s all they are doing.
It may be that the request can’t be satisfied.
It may be they’re asking for something right now that is impossible. The button is also trying to display what the current state is. It’s a view, not just a controller.
And the problem with toolkits like GTK is that they just weren’t written with this idea in mind. So when you click on the button, e.g. if it’s a toggle button to turn something on and off, you click on it and it immediately toggles. It just changes its visual appearance to say “I’ve been toggled”.
But the truth is, GTK and the button don’t know whether anything has really happened. They only know that a user clicked on it.
So we had a bunch of these widgets that, although they work very well or certainly adequately for certain regular desktop applications, don’t work if you want to use MVC. So we’re also moving away from it because we needed to do our own buttons and our own drop-downs and all these other things that we needed to replace to make it work.
Well, we could make the GTK ones work, but it involves just stupid levels of hacking around. So the things we still need GTK for are text entry, file dialogs, menus, and tree views.
Re-implementing any one of those is a massive task. Text entry to me is the most subtle one. People don’t realize that. You know, you see a little box on the screen, you start typing on a keyboard, characters show up. I mean, how complicated is that?
Even as a developer, it seems pretty obvious. You get a message that the user pressed the L, so we put an L in the box. No, it just doesn’t work like that!
No, not so easy!
At all! And you have right-to-left languages etc. There’s just so much stuff associated with that, that to do it right... If you looked at the code that does that kind of stuff in GTK, it’s just a huge blob of code. And we really don’t want to have to reimplement that.
So we are at the stage where we’d like to avoid doing new interface work using GTK stuff as much as possible.
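To make the earlier toggle-button point concrete, here is a rough sketch of the behaviour described above, where a click is only a request and the widget shows whatever state the model actually ends up in. It is illustrative Python, not Ardour or GTK code; `Track`, `MuteButton` and the `locked` flag are made-up names used only for this example.

```python
class Track:
    """Hypothetical model: owns the real mute state and may refuse requests."""

    def __init__(self, name, locked=False):
        self.name = name
        self.locked = locked
        self.muted = False
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def request_mute(self, wanted):
        # A request, not a command: it can be rejected.
        if self.locked or wanted == self.muted:
            return False
        self.muted = wanted
        for callback in self._listeners:
            callback(self.muted)
        return True


class MuteButton:
    """Hypothetical view/controller: displays model state, never its own guess."""

    def __init__(self, track):
        self.track = track
        self.lit = track.muted
        track.subscribe(self._on_model_changed)

    def click(self):
        # Clicking only asks the model to change; the display is not touched here.
        self.track.request_mute(not self.track.muted)

    def _on_model_changed(self, muted):
        # The visual state follows the model, and only the model.
        self.lit = muted


free = MuteButton(Track("vocals"))
free.click()
print(free.lit)    # True: the model accepted the request

frozen = MuteButton(Track("drums", locked=True))
frozen.click()
print(frozen.lit)  # False: the request was refused, so the button did not toggle
```

A stock toggle widget behaves like a `click()` that flips `lit` directly, which is exactly the mismatch described in the interview.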
But at the same time I don’t know if anyone really wants to make a commitment to reimplement any of those four things.
So I think GTK will stick around because we need those four things. But new dialogs we’ll try to do in different ways.
The other issue with GTK, at least with the older version that we use, is that it uses what’s called a box packing model. So when you’re laying out the screen, the model is just taking rectangular boxes and stacking them either vertically or horizontally. There’s different ways to construct a user interface; this one is not good, it’s not bad, it is a mechanism. There’s some things it does very well, there’s some things it’s not very good at.
I think GTK4 has greatly extended their box packing model because of the problems that it faces. The one change I like the most is, you know, if you have a piece of text that you want to display on the screen, and you’ve got a certain area for it... Imagine a long piece of text. And you’ve got lots of width. The obvious thing is, you are going to stick the text on one line. On the other hand, if the space you want to display it in is tall and narrow, you want to display it wrapped on multiple lines.
So one of the things they added in the new versions of GTK is the idea of asking something: “How tall would you be, if I made you this wide?” or “How wide would you be, if I made you this tall?”. This is one of the ways that they started addressing some of the problems with box packing.
The other model is called constraint layout. The idea has been around for a long, long time. As long as constraint programming itself. It got a lot more publicity when Apple added support for this to their own native UI.
I think they started it with iOS.
Yeah, I think they started with iOS and then moved back to the regular UI kit on macOS.
And with this, instead of stacking boxes and stuff, you basically say things like:
This needs to be to the left of this
This needs to occupy half of this space
This needs to always be one pixel below
So you set all those constraints and say: “Alright, I’ve told you the rules. Figure it out”.
What I love about the solution is that with constraint packing you really could do anything. If you wanted to do per-pixel layouts the way that a few people still do, then you could do that with constraint packing. But you can also do these creative versions of it.
And there is a library, emeus, that exists to do this type of thing. We didn’t switch to it as part of 6.0 because the only good wrapper for using libemeus in C++ needs a newer version of the C++ compiler, which we don’t currently use.
I made the executive decision, which I think Robin has mostly agreed to, that after 6.0 comes out, we are going to shift to the incredibly new C++ 2011 version. Only 9 years old! (laughs)
But that will allow us to start using that library. And my hope is that will allow us to write our own packer for the canvas. That will let you do any constraint layout, and that means we can use a canvas to start doing new dialogs and new arrangements of things. You could imagine, for example, mixer strips, which right now use GTK box packing. And you could imagine this whole thing is gonna be the canvas. And we’ll do this with a constraint description where anything goes.
We have to go to C++11 first and then we can bring in this new library and then I can start playing around with, you know, how that works and what that will let us do.
Again, there’s no reasonable time span in which we can get rid of GTK. But I think this will allow us to fulfill the goal of not relying on it for new visual interactions.
Given everything you’ve just said, if you were to start the Ardour project today, what would be your technology stack?
Would you go for the web stuff like the GridSound project you probably heard of? Or would you use something like DPF? What would you do?
The most obvious choice to me right now is JUCE. It’s a library that was created in the context of building applications like Ardour. It is cross-platform in a way that e.g. GTK and Qt still struggle a little bit to be.
There are incredible differences between the way things work on the Mac, Windows, and Linux. I believe JUCE handles this better, but nobody I know who’s associated with the Ardour project has ever looked in great detail at JUCE.
And the one thing I know from talking to developers in other audio tech companies is that all GUI toolkits suck basically. Well, ‘suck’ is the wrong word. All of them have their own issues. And when you start using them for something as complicated as a DAW, you start running into them. So I don’t know for certain what it would be.
The other path to go down... Yeah, DPF would be a possibility. And there’s another one of Robin’s projects, Pugl, where you probably wouldn’t do GL anymore just because GL seems to be fading away. So... whatever the cool 2D graphics layer is, Vulkan, Metal... Something very thin on top of that to let you handle events, and then a bunch of widgets, buttons and blah blah blah.
The problem is, as I said, I don’t want to rewrite menus and tree views and so on. So they would need to be part of the toolkit.
I don’t know one that quite matches that description right now. I do think that for something like a DAW... From what I know about the history of Blender, for example, pretty much what they did was they said they were not going to rely on anything else but GL on the bottom and “then we are gonna build all those things”. And then at some point in the history of Blender someone said “Oh my God, this has become crazy. We need to actually turn this into a real toolkit”.
And I’m not sure whether they’re still actually even using that GL-based approach at this point.
But that sort of direction: we’ve got a lot of 2D drawing API, and we’ve got some mechanism for handling events. And we are just going to build everything on top of that and move away from these desktop toolkits.
That would be the other option but, again, we did a port, a long, long time ago, 18 years ago now, from GTK1 to GTK2, which is kinda similar. That took between 6 months and a year. And that was when the code base of Ardour was maybe less than half the size that it is now. And I know that if we ever tried to port to another toolkit, it would probably take two years to have it working.
From my perspective, that would not be worthwhile. That would just be wasted time. Even though there would be some users who would look at the final results and say “Oh my god! That’s so cool! It looks so much nicer!”. For most people it would just be, like, “What have they been doing for two years?” (laughs)
Sounds about right :) So you recently had a conversation with the guy we know as Tantacrul. He’s now the head UX/UI designer at MuseScore. Any chance of beans spilling?
At the moment, there isn’t really much to say. Martin and I had a brief conversation yesterday. It was constantly interrupted by connection issues. I was out in town and was using my hotspot and my battery ran out, which was pretty embarrassing, and then the cell phone company dropped... Anyway, we were going to do it again this morning, and other scheduling issues came up. We’re going to do it again in the near future. Maybe there’ll be some stuff to talk about.
He’s a fantastic guy. Anybody who hasn’t seen Tantacrul’s videos on the design of Sibelius, MuseScore, and, recently, Dorico... It’s worth watching. Funny but incredibly insightful analysis of things.
And even from my brief interaction with him it’s just clear to me that he’s just an insanely smart guy when it comes to user interface design.
And that’s the kind of person that (no offense to anybody who’s been involved in Ardour)... we don’t really have anyone that has the background that he has.
Even the people that we have who are good interface designers... You know, Ben Loftis at Harrison springs to mind; he had a lot of fabulous insight, some of which we unfortunately ignored when we shouldn’t have. But I think Martin brings even more, really deep-scale understanding of how to do things.
So I’m hoping that even if it’s just, you know, informal, once-in-a-while conversations or interactions via some medium, we might benefit a little bit from the kind of insights that he has. Even if it’s just encouraging me or somebody else to just do more user testing, which is one of the things that he’s been so good at.
Would you ever do the traditional UX focus group testing?
I don’t know how we would do that... It actually becomes somebody’s job, and often more than one person’s job, to do that.
And I don’t think we have anyone in the community right now who could actually do that or would do that.
Well, I mean, like, hiring an agency of sorts.
Yes, but I think... I mean, one of the reasons I’ve been hitting up Martin... I mean, even from my brief conversations, I clarified it: his job at MuseScore is actually a real job, it’s a full-time gig.
I don’t know how much time he will have. But one of the things I was interested in talking to him about originally was, you know, “Hey, are you interested in doing this sort of thing for us?”. It doesn’t need to be Martin. But watch his Dorico video from a few days ago. And some of the insights that you could see being gained so easily from just watching new users trying to do stuff... I watched that and I thought: you have to be stupid to not want to gain those kinds of insights.
After almost 17 years in the making, Inkscape 1.0 is out. Let’s be fair: this is one of those cases when the version number doesn’t nearly represent what’s actually in the box. The software was perfectly usable right at the point of forking from Sodipodi back in 2003; I’m speaking as an eyewitness here. The v1.0 release should’ve happened years ago, but the team took a very conservative approach.
Personally, I stopped using Inkscape 0.92.x and switched to what later became v1.0 a year ago or so. So far, it’s been a good run. I still have personal beefs with some UI solutions, but I realize it’s partially due to using a toolkit that is OK for generic desktop applications and not exactly stellar for specialized software. My personal impression, if you are interested, is that the current team is highly motivated to push this project in the direction of making it a better tool for illustrators. But user impressions are one thing, and what developers actually think is often a whole different thing. So it’s interview time.
I guess there’s just one disclaimer left to make. Answers to my questions arrived after the final GSoC slots had been announced, so at least one of the questions might not make as much sense to you as you’d probably expect it to. Oh, and all the illustrations below are close-up parts of the about screen for Inkscape 1.0, made by Bayu Rizaldhan Rayes.
First of all, congratulations! This is a huge milestone. As a large project with a massive user base, I bet that you often feel overwhelmed because people expect so much of you. Some want Inkscape to become an animation tool, others want more CAD features (with constraints, no less), and the list goes on. But Inkscape started out with a mission to become THE editor of SVG files. I mean, even the version numbering scheme used to reflect the coverage of the W3C specification, like v0.50 for 50% coverage etc. (this never happened, I think). Now, the homepage doesn’t even mention the word “SVG”.
So how do you actually market Inkscape these days?
Bryce Harrington: After we’d released a few versions of Inkscape, one of our users posted a drawing of a glowingly photorealistic car; it’s the one you’ve probably seen on the Wikipedia page for Inkscape. Seeing how our users were employing Inkscape made many of us realize the scope for Inkscape was incredibly broad.
Along with that, users appeared who hadn’t heard of SVG prior to Inkscape. SVG was just a file format, one of several possible ones they cared about. These users viewed Inkscape not as an “SVG editor” but as something that solved even more general drawing needs. In hindsight, there was a point in time where Inkscape did in fact fulfil that mission to become the de facto standard SVG editor. There really weren’t many options for GUI SVG editing back then, so as Inkscape developed it became the go-to for SVG file authoring. Even today, from laser engravers to game developers, you see Inkscape referenced as an important tool. But at the same time a whole ecosystem of other SVG editing and viewing tools for various use cases has grown.
Marc Jeanmougin: I tend to think of Inkscape as a multi-purpose vector graphics tool. It’s not (yet) THE tool for some precise vector goal, but it’s a good tool to get into vector graphics, and a good tool to achieve almost anything in this area (and the improvements we try to make to customization possibilities will make it easier to have Inkscape be THE tool for specific goals). For me, the use of SVG is the only obvious choice of an open, hand-editable, web-friendly file format, and we want to insist more on being a vector tool than on technical discussion.
Ryan Gorley: Inkscape will ultimately end up at the intersection of well-documented, regularly requested features and what our developers are capable of and interested in building. In this way we’re not like a software enterprise that is looking for growth markets and planning features that way.
We have been somewhat constrained by what is possible within the scope of the SVG spec, but that may not always be the case, and there is still a lot we can do within the scope of SVG. Ultimately we will go where our users help us go by getting involved.
In fact, how much are you willing to modify the spec by introducing a further superset of SVG features to add features missing in the original spec? Do you have any kind of policy or informal agreement on that?
Marc Jeanmougin: Not more than strictly needed. There is a lot that can be done with SVG, and ideally we would only implement things that should be in the specifications. Of recent additions, only meshes are not doable in other ways, and not supported in other viewers. Many people (including many Wikipedia contributors) rely on Inkscape to create standard files that would look good on Wikipedia, and it would be detrimental to the SVG format and to Inkscape to add features that get in the way of this use.
Bryce Harrington: Your question has been top of mind in the project’s strategy since before it even existed. SVG, like any protocol, has constraints; particularly for people that view Inkscape as more than just an SVG editor, the limitations felt unnecessarily arbitrary to abide by. Yet, sticking to the standard, despite its idiosyncrasies, was in fact one of the rationales for establishing Inkscape, and that differentiated it from other vector graphics tools.
For these reasons, the Inkscape Project viewed the W3C’s SVG efforts as crucial to the software’s future. We consciously sought to address the limitations by working within the committee. Indeed, thanks to generous donations from thousands of Inkscape supporters, we were able to attend and participate in many of the W3C SVG working group meetings. We were one of the only all-volunteer organizations that did this, something this community should be proud of making possible.
Tav, our representative in this committee, authored and promoted a number of important SVG features, several of which are now enshrined in the latest SVG version.
One of the most remarkable benefits is that Inkscape can be used to author content that is then loaded into other tools or browsers. Inkscape takes seriously its role as a partner in a much larger ecosystem of FOSS graphics programs. Being able to interoperate with other software makes the ecosystem as a whole stronger and more powerful.
Tavmjong Bah: SVG is a dialect of XML and the wonderful thing about XML is that it is eXtensible. We don’t need to break SVG to implement cool things like Live Path Effects (which live in the Inkscape namespace). Multi-page SVGs would be easy to handle this way.
The situation with the development of SVG didn’t look very good a few years back. What’s your outlook for the standard? As a project that is an obvious stakeholder, how much do you still participate in the W3C SVG working group?
Tavmjong Bah: The SVG working group still exists and the W3C management is committed to supporting it… although what that support ends up being, we’ve yet to see. Bi-weekly meetings are attended by only a few members and progress is slow. One hopeful thing is that there is quite a bit of discussion going on at the Git repository. Some features important to Inkscape, like mesh gradients, have been postponed to post SVG 2.
Bryce Harrington: Fortunately, SVG is designed to be extensible, so at least the “Inkscape SVG” format can continue to be improved for Inkscape users. The problem will be loss of compatibility with other SVG-based tools such as web browsers. It’s unclear what could be done to better resolve this; hopefully smart people come up with some good ideas.
Martin Owens: I’d say we’re at the point of supporting SVG as much as possible, but we’ve mostly given up trying to add editing features to the SVG specification.
After all, the W3C is dominated by web browsers, which don’t need multi-page or connectors.
I dare not say much more about W3C-specific things. I know that I’m personally disappointed that Inkscape’s considerable importance in the SVG creation space does not lend itself to getting the features we intend to build into Inkscape into the actual SVG specification. This does lead to the problem that, going forward, we’re likely to have browser incompatibilities (think about what a browser will show of a multi-page SVG). But this isn’t Inkscape’s official position, just my thoughts.
Inkscape v1.0 has quite a few new features, and the release notes deal with that beautifully. But a lot of work was done under the hood. Could you please summarize those changes and what they mean for the future development of the program?
Alex Valavanis: My main work at the Boston hackfest, which was partially released in v1.0, was to move us away from the deprecated GtkAction API. Essentially, this was a case of separating code that deals with Actions (things that Inkscape can do) from the UI (the controls that make Inkscape do those things).
There are a few reasons why we needed to do this. First, GtkAction has been removed from Gtk+ 4, so it’s important for future-proofing. Second, and more importantly, it brings a lot of flexibility to the way Inkscape can be used and customised in future releases. For example, custom UIs could be developed to tailor the Inkscape interface to particular applications, without having to touch the C++ code: think a bold, simplified “Inkscape for kids”, or a “Technical Inkscape” that eases the workflow for CAD/scientific illustration.
There has also been a lot of work on gradually migrating towards more modern C++ standards, which make the code safer, more reusable, flexible, and simpler to maintain. This has been accompanied by better use of continuous-integration tools to reduce bugs being introduced into the code-base.
As a result, users should see fewer crashes, regressions, memory leaks and annoying Glib warnings!
Marc Jeanmougin: As for “under the hood”, I think the main thing that has most changed in Inkscape development compared to a few years back is our move to GitLab. It has made it easier to discuss changes and technical progress, identify some regressions with continuous integration, and made the life of new contributors much easier.
In terms of Inkscape itself, GTK changes, even if very frustrating, should lead to long-term benefits, and IMO the overall code quality is improving over time, which means fewer bugs in the long run.
Apart from missing features, I think, two major complaints people still have with Inkscape are various UI issues and performance. UI/UX typically gets better when professionals get involved. As for performance, I’m guessing, some of that can be solved by better algorithms, and some could be improved by GPU-side rendering. Do you have a plan for any of that?
Marc Jeanmougin: Yes, and yes. Our community structure and the use of chat.inkscape.org make it easier for UX professionals to get involved in improvement discussions and for developers to contact them (rather than the IRC channel [that I still think is better in lots of ways, but it’s hard to convince non-developers of that]), and we already have examples of UX people coming to discuss some parts of the interface. As for performance, there are three big pieces of work planned. First, many performance problems are caused by our signals (what should be triggered when things happen in the interface), and some developers are planning to look into that. Second, GTK3 is a mess in terms of performance, and we are affected by its quirks a lot. Ideally, GTK4 should use the GPU for the interface, and there are chances it will only improve. Third, making canvas rendering on GPU is a long-term goal we have, and we have a GSoC student this year working on exactly that, trying to get Pathfinder (a Mozilla GPU renderer) into Inkscape.
Bryce Harrington: What gets done depends on who shows up. In all-volunteer-driven organizations, the flow of value isn’t measured in money but rather in the time and effort that people put in. And those volunteers are motivated not by increased product sales, but by more personal and intrinsic reasons. As a group, we’ve had discussions, plans, and ideas on how to increase performance and improve UI/UX. You can trust that there is no shortage of thinking here. But ultimately it depends on what kinds of volunteers show up, what their personal itches are, and how much energy they can commit.
You can look at the Mac port as a perfect example of this in practice. The poor Mac support was the biggest complaint we heard for Inkscape, yet none of us were Mac developers (or even users). But the desire was so strong that we were able to engage a number of volunteer Mac users to get involved, figure out solutions to the problems, and make it happen. This group is still not perfectly satisfied with the port, but they’ll keep hammering on it.
At the Boston hackfest, we discussed plans for reworking the internals to have a better structure between the “frontend” and “backend”, and for 1.0 a big chunk of this infrastructure work was achieved. However, ultimately it’ll depend on new volunteers to bring their ideas and muscle.
Ryan Gorley: One small way we’ve tried to attract more contributions on the UI/UX front was by creating a dedicated channel for these efforts on our team chat. That has prompted some good conversations, but we can certainly do more.
I have found that our developers are very receptive to input, but it is not always transparent to a newcomer who is working on what part of the project.
I think we could do better helping make introductions. I’d love to see more UI/UX people participating in our hackfests.
The team of developers has changed quite a bit in the past several years. Last year at LGM (where you had a coding sprint), I saw quite a few new faces. In terms of how the project (as people) is structured, how much has changed? Do you still have developers who can hack on virtually any part of the source code, or do you now have more people who specialize in something? Like, the way Jabier is your LPE guy.
Bryce Harrington: The project composition has indeed changed dramatically. Many of us have gotten busy with other aspects of our professional and personal lives, leaving less time to volunteer. Burnout can be a problem, too.
But Inkscape was established with an egalitarian structure that’s larger than any one person. This is why value is placed on making it inviting to new contributors: we have no idea who next year’s Inkscape maestro will be. Maybe it could even be you? So, we love seeing new faces, and are understanding that old timers sometimes need to step back.
Structurally, one of the biggest changes has been establishing sub-teams within the project, allowing groups to specialize and optimise their skillsets. The project is really too big for one person to have a hand in everything, and the days are gone where every Inkscaper was knowledgeable in everything. The narrowed focus of these teams also makes the project more approachable by outsiders, who might find the project as a whole overwhelming.
Marc Jeanmougin: Personally, I’d like to introduce a concept like code owners, where some people would be supervising parts of the code (like LPE, GTK/UI, rendering, text, etc.), but that’s for future discussions, and we have a few people able to hack on any part of the code.
Tavmjong Bah: Inkscape’s codebase is very large and takes quite a while to learn (and I am still learning about it all the time).
Some of the code is very complex, like the text rendering part… and it’s very hard to get right. Still, there are areas where newcomers can get involved and make a difference. As some people move on, others come to take their place.
One thing you were planning to do a while ago was to start doing paid development. I think (but not sure) the idea was to use the donations you get to pay people to work on Inkscape. But right now, it seems that some of the Inkscape developers (Tav, Marc, maybe others) still have personal campaigns at Patreon. Have you abandoned that plan?
Bryce Harrington: This is another topic worthy of an entire article itself, and another one with no shortage of ideas.
There are essentially two problems. First, donations to Inkscape aren’t quite large enough to reliably cover the annual salary for a developer. Inkscape could really use people with a passion for fundraising to help us solve this; finding a volunteer for this would be a game changer for the project. The second is administrative: setting up contracts for hiring, and overseeing their work tasks.
These hurdles are not insurmountable, and Inkscape’s taken some experimental shots at solving them, but paid development is a tricky proposition to get right. And given the wider economic disruptions of the world today, it seems unreasonable to ask users to increase donations to a level that Inkscape would need to sustain development financially. Hopefully the idea can be revisited in the future, but for now it’s probably best to focus on enhancing our volunteer base.
Marc Jeanmougin: I’m not sure what to answer here… I still think having full-time people on the project would benefit it, but it would be a big step for the project and it’s always very hard to commit to this.
Ryan Gorley: There is still a lot of sincere disagreement within the project about how we would best go about paid development.
The Patreon approach is one of many that have been discussed, but none have been officially advanced by the project. I know that will continue to be a topic of discussion because we’d all like to bring our users the features they want and reward our contributors for the excellent work they do.
Are you comfortable with the development pace? If you wanted to make the dev cycle shorter and thus deliver updates at a faster pace, what kind of project changes would you introduce?
Marc Jeanmougin: I think we should have more frequent updates, maybe with a list of specific things we’d want for the next release decided shortly after a release…
Ryan Gorley: I know that smaller, more frequent hackfests have been discussed.
Bryce Harrington: ‘Release early, release often’. I’d love to see faster, smaller development cycles. When things go too long between releases, it takes extra long to stabilize, and the longer between releases, the higher expectations are on each release. However, release engineering is hard work.
The conversion to GTK3 was a special case that really required a longer than normal development cycle. I hope that, with that in the rear-view mirror, releases can be faster and lighter, but we’ll have to see how things go.
Every software project has something that needs doing, whether it’s missing features or some website work, or social media wrangling. What kind of skills would you say are in greatest demand in Inkscape?
Bryce Harrington: The build-up to 1.0 involved higher attention to stabilization work, but with this release completed, work can shift to meatier development work.
This would be a great time for new developers to get involved, learn the codebase, and undertake some of these more ambitious ideas.
The developers are already discussing updating the codebase to C++17, and anyone interested in polishing up their C++ skills might find that a great way to learn the codebase, help the project, and improve development skills at the same time.
Bug work (both triage and fixing) will always be a crucially important area where contributors are needed and heartily welcomed.
As mentioned above, fundraising has come up as an area where demand exceeds capacity by a fair shot. Even a small amount of help in this area could push Inkscape to sustainability.
Marc Jeanmougin: Not sure, we need everything :) Is there any skill description for someone able to manage a release, monitor bugs, fix them, and coordinate with everyone while doing so?
Martin Owens: As for skills related to website work, we need Django (Python) developers. But more than that, we need sysadmins to look after some of the infrastructure. Not people who just pop in and out now and then, but people who can commit to helping out in the long term. People with experience and skill who can prevent the kind of issues we had with downloads during the launch. Or help us put together new infrastructure. Programmers are great, but honestly, infra needs sysadmins.
Ryan Gorley: I think there is enough to be done that nearly anyone could make a difference if they expressed their willingness and stuck around long enough to find the need that is a good fit for them. Everything is connected. Yes, more developers would be great, but someone willing to document bugs or triage new issues makes life easier for the developers we have.
We’re still testing and migrating bugs from our old bug tracker to GitLab, which is something any Inkscape user can get involved with. The skill most in demand is the skill whoever is reading this has. Come get involved.
You'd have to be crazy to rewrite GIMP in anything: interview with Ell
There aren’t many things as empowering as seeing a seemingly random person starting to hack on a free software project with a huge codebase and quickly grow into a major contributor that suddenly everyone listens to. So I listened some more. This is an interview with Ell, one of the several core GIMP developers.
Before you dive in, I owe you a disclaimer that I’m a GIMP contributor. So you should always treat whatever I write about this project with no small measure of skepticism.
Oded Nuriel never really intended to become an almost anonymous GIMP contributor of whom, until fairly recently, we only knew the nickname (Ell), gender, and country of residence. It just happened. So this is your chance to get to know him a little more, too.
The screenshots are of various features he added over the past 3+ years, from the first one to more recent ones.
Everyone starts somehow in a software project. What was your unscratchable itch that got you to begin writing code for GIMP?
My first contribution was the “Diagonal Neighbors” option of the Fuzzy Select tool (AKA the “magic wand”). As much as I’d like to say I needed it for creative reasons (that would make for a better story), I was trying to analyze images for a programming challenge. This involved selecting narrow curves, made up of pixels that are connected either orthogonally (left, right, up, and down) or diagonally. The Fuzzy Select tool could only select orthogonally-connected pixels at the time. Not having any luck finding another program that could do that, it seemed like a good excuse to check out the GIMP codebase. I needed a break, and it’s always fun to look under the hood of the things we usually take for granted.
Fortunately, adding an option to select diagonally-connected pixels was the perfect task for getting my feet wet: simple, self-contained, and even marginally useful!
I sent out a patch and didn’t expect for there to be much more to the story, but here I am.
Oh, so what’s your primary use for GIMP?
I’m a casual artist at best; I do use GIMP to edit images, of course, but my real canvas is the code. For me, GIMP is a creative outlet to try new stuff, learn new things, or just kill time coding something.
I think, a year or two after you joined, I started noticing that mitch [Michael Natterer, GIMP maintainer], who has been around for almost 20 years, started consulting with you on this or that. Meanwhile, I keep hearing how writing C is horrible and too low-level, how badly structured and documented GIMP’s source code is etc. And I’m personally guilty of referring to 1+ mln LOC of GIMP/GEGL code as a really high barrier for newcomers :) But you managed to fit right in. What’s up with that? Do you think C is really a bad choice for a project like GIMP? Is the LOC count the death of GIMP?
You know, I don’t think I’ve ever heard anyone comment on the quality of the code. We bash it all the time among ourselves, but a lot of it is just banter. I’d actually love to hear that! When you first start publishing your code, your main concern is that someone will actually look at it and criticize it. You very quickly learn that you should be much more worried that no one would bother.
Anyway, I digress. I don’t think that the code is badly structured. Some parts of it are in worse shape than others (and a few are downright ugly), but this is unavoidable in a codebase this large, written over a long period of time, by many different people. For the most part, though, I think the code is relatively well organized, straightforward, and easy to hack on. I’ve seen much worse! C is a low-level language, but this is the level of control software like GIMP needs; image editing is a resource-intensive task.
It does pose a higher barrier to participation than some higher-level languages would, but it’s hardly a controversial choice: it’s still a popular language that isn’t going away any time soon. It’s not necessarily my favorite language, but it’s a lingua franca when working at this level.
The death of GIMP… I don’t think it’ll be the LOC alone (a large codebase is sustainable if you can process it in small chunks), but it does make GIMP more vulnerable to changing circumstances. GIMP will die when the effort required to maintain it stops being worth it. We’re already seeing the strain that the GTK3 transition is incurring, and it’s bound to repeat itself. One day, it’ll stop being worth it.
A while back, I heard you jokingly saying something about rewriting GIMP in Rust in 2020. I did laugh it off first, and then I was, like, okay, but this is Ell. IS THERE MORE TO THIS?! :D But seriously, you already introduced teeny-tiny parts of C++ code to GIMP for async stuff like paint dabs. Do you think GIMP would benefit from an even bigger rewrite?
Oh, you better believe it! GIMP in Rust 2020: it’s happening, baby!
Seriously, though, you’d have to be crazy to rewrite GIMP in anything. The biggest advantage GIMP has is that it already exists. We already have a foundation to build on, so that we can focus on the more interesting, rewarding, and important things. This sometimes involves rewriting parts of the code (sometimes even in a different language!), but it’s usually a focused effort in service of a more tangible goal. Moreover, GIMP is far from perfect, and everyone involved is very much aware of that. It has technical debt and false starts. There’s a lot to learn from GIMP about how to write an image editor (it has some very clever ideas that can easily go overlooked when only looking from the outside), but I don’t think it’s worth replicating one-for-one.
If you’re going to make changes on such a large scale, you’re better off writing something new altogether.
One argument I sometimes hear from GIMP hat… oh well, GIMP critics, should I say?, is that the team has no clue about image formats, or color science, or image processing algorithms. We can argue all night long whether it’s true. But, in your experience, how much specific knowledge does one really need to start contributing to any image editor or writing a new one? How much do you personally learn as you go and what’s your process?
GIMP is a big program. Most of the “exciting” work (the one involving more specialized knowledge) happens “at the leaves”: at the periphery, rather than at the core. The bulk of the work focuses on the rest of the tree, on providing structure and connecting it all together. In other words, there is plenty you can do even without deep knowledge of image processing. Some knowledge of the basic concepts is necessary to make sense of the code, or to write your own, but none of it is too involved.
Personally, I try to find a balance between working on exciting things, and more mundane things (a necessary evil). I try looking for things I haven’t done before, so there’s a fair bit of learning along the way.
The process is not always the same: sometimes I take a stab at something without doing any research first, just to see what I can come up with, and only look at literature, or other software, once I’m more-or-less happy with what I have. Other times, it’s more about trying out something I read or saw somewhere.
From your standpoint, what does it really take to start hacking on GIMP? I know the team usually recommends taking a stab at low-hanging fruit like bugs that are easy to fix or small features that are easy to add. What’s your recipe?
It’s not an easy question. I think a lot of people fail to realize that finding something that’s worth working on is part of the challenge.
It’s pretty common for people who want to start contributing to ask us “what should I work on?”. Hell, if I had a good answer to give you off the top of my head (something interesting enough to be engaging, valuable enough to be rewarding, and self-contained enough that you could actually finish it), I’d be working on it myself! I actually don’t think the bug tracker is a very good place to look for project ideas. There’s usually a reason why bugs remain open for as long as they do: they’re either not as simple as they look, or not very interesting (or both!). Bugs in particular, rather than features, are also a riskier investment, because they face a higher level of scrutiny (for better or worse): we want a fix to be right, which can require a certain level of familiarity with the code; new features, especially self-contained ones, get a more liberal treatment.
A self-contained feature that you have a personal interest in really does seem to be the way to go; it’s how I got started. Either way, be prepared to see the work through yourself. I hope this doesn’t come off as cynical; collaboration does happen, and we do love to help out (even if we don’t always have the time or energy), but ultimately it’s up to you. Paradoxically, the more willing you are to do things on your own, the more likely others will be to want a piece of the action.
Another thing I’ve heard from GIMP critics is that the Dashboard dock is definitely useless for end-users and probably useless for developers. It might very well be the case that we didn’t explain this feature to users well enough (references to a similar feature in Photoshop clearly didn’t work). You’ve been working on it for almost 2 years now, adding small new features etc.
What use cases have you developed for the Dashboard over time?
The Dashboard isn’t meant to be used by everyone, but it can be useful for technically-minded users from time to time, for the same reason your operating system or even your browser have a resource monitor. It’s not something you normally use, but occasionally this information can be useful. GIMP is a resource-intensive program, and it can quickly gobble up memory when working on big projects; being able to keep an eye on it can come in handy. As for development, I can’t speak for everyone else, but I use it all the time. In fact, I keep a whole pane dedicated to the Dashboard always open. My initial use for the Dashboard was hunting down a memory leak, and it still gives me a way to see if anything unusual is going on under the hood, even when I’m not specifically looking for it.
Performance logs, which record an execution profile of the program, turned out even more useful. Pretty much every performance-related change I’m making these days is guided by performance logs, mostly ones I record myself, but we also get logs from users. There have been some important performance improvements as a result of users sending us performance logs, especially for environments we don’t normally use ourselves.
There is a strong argument from users that GIMP and GEGL do not make use of AVX/AVX2 at all (and babl only recently started doing some AVX), which badly affects processing speed. Given that, these days, you seem to be the performance guy in the team, what’s your perspective on that?
It’s true that there’s performance to be gained here, but only up to a point. The big-impact changes happen at a higher level: better algorithms, better design, better utilization of resources. Writing hand-crafted code paths that make use of extensions like AVX is especially worth it for code that runs very frequently, like certain babl conversions.
It’s not the only example, of course, but don’t expect to see it happening all over the place; we have bigger fish to fry.
If there’s one form of acceleration that could really make a difference, and that we don’t properly utilize, it’s the GPU. GEGL does have OpenCL support, but it’s not in great shape, and it’s not pervasive enough.
Soon after you joined, you were working on some kickass region selection improvement code, then you lost all of that because you hadn’t pushed it to upstream Git, and your hard drive went kaput. Are you still living on the edge? :)
I’d love to tell you that I learned my lesson, and that I religiously make backups now, but… Hey, I did go full SSD since then, so there’s that! Of course, nowadays it’s the norm anyway, so I guess I’m back to square one.
I do plan to get back to the selection-tool improvements eventually, but since it’s something I’ve already done once (well, technically), it’s on the back burner.
You’ve tied quite a few loose ends in GIMP over the past three years. Some of that stuff has been on the blockers list for users for well over a decade. I’d like to think you are just warming up :) So what represents the ultimate challenge for you feature- or performance-wise?
Wow, I don’t think I have an “ultimate challenge” I can single out, just a bunch of bigger, more challenging tasks I’d like to work on, or see happen in general. Most of it is “the usual”: non-destructive editing, vector layers, etc. Some of it is a bit more specific: I have a long-standing plan for a new native file format that’s more suitable for big projects, enabling things like background- and auto-save, backups, and snapshots.
Performance-wise, better GPU utilization is definitely interesting.
The more likely next big performance boost, though, is processing the image at a reduced level of detail when zoomed out, which has the potential to make a very big difference for large images.
On a more ambitious level, it’d be interesting to venture into new territory: more powerful vector tools, maybe even 3D and animation. The inroads machine learning is making into image processing are also hard to ignore, and I’d like to get in on that. In the meantime, though, we have our work cut out for us.
Double-week highlights: major new releases of darktable, FontForge, LuxCoreRender/BlendLuxCore, and LilyPond, new release candidates of Natron and Dust3D, first milestone for VR support in Blender is done, enve is becoming an SVG animation authoring tool.
Graphics
GIMP developers are picking up speed after the last major update. Now that RawTherapee reads Canon CR3 files, the file extension is also recognized by the program, so you can use RT as a file loader in GIMP for these files. Layer group previews are now rendered a lot faster, and the word on the street is that multiple layers selection is about to land pretty soon.
The Krita team fixed a lot of bugs since releasing v4.2.9 two weeks ago and finally merged the resource management rewrite into the main development branch. Among other changes: support for JFIF files and support for HEIF/HEIC on all platforms. You can also help Kuntal Majumder improve the Magnetic Lasso tool.
The FontForge team made a new release with new features and bug fixes, the first one since August 2019. This probably deserves separate coverage.
UX
There’s a lot of activity in the AkiraUX project lately. Alessandro Castellani recently did a live session on the latest changes.
Photography
The darktable team released version 3.0.1 with quite a few major changes, contrary to the version number:
New color assessment module in the darkroom
Focus peaking in both lighttable and darkroom modes
Easy resize of the sidebars (at long, long, LONG last!)
Support for compressed LUT (.gmz) files in the 3D LUT module
See the release announcement for more info. Oh, and if you want fun, Heiko Bauke is working on a GMIC module for darktable (uses an older database schema last time I checked, so you need a separate database right now).
The Siril team finally merged the float branch to the main development branch, so 16/32-bit float precision editing is officially going to be part of v1.0.
Animation
Synfig developers ended up disabling background rendering by default until they figure out how to make it work reliably with complex artworks.
I can list all sorts of recent improvements in enve all I like (e.g. GPU-side motion blur is cool), but the real game changer is that the program is now pretty much the first SVG animation authoring software for actual artists out there. Yes, you already could do nice things with Inkscape and JavaScript. No, this is a whole different game.
"You can now preview the resulting SVG file before exporting it. pic.twitter.com/1T2S2OPYRQ" (enve, @enve2d, March 23, 2020)
3D
A really huge recent change in Blender is the first stage of VR support, available in the master branch now.
"I'm STUNNED. I downloaded today's build from https://t.co/bb3Md9z3jx, launched it via "blender_oculus.cmd" on Windows, enabled the VR addon, and imported glTF's Damaged Helmet. It displayed in my Oculus headset on the first try. AMAZING job here. Go @OpenXR! https://t.co/s8sIERZK6o" (Ed Mackey, @emackey, March 18, 2020)
A few important notes from Julian Eisel:
To be clear: This is not a feature rich VR implementation, it’s focused on the initial scene inspection use case. We intentionally focused on that, further features like controller support are part of the next milestone.
Currently Windows Mixed Reality and Oculus devices are usable. Valve/HTC headsets don’t support the OpenXR standard yet and hence, do not work with this implementation.
The Blender team posted an article outlining both ongoing and planned work on asset management. The post introduces basic concepts and provides a roadmap for the development of this feature. Meanwhile, a new release date for Blender 2.83 is set to May 20.
Because of the quarantine, Blender devs are staying at home, so Pablo decided to make the best of it and switched to daily live streaming. Here is the first one with the Grease Pencil team.
You probably also don’t want to miss a huge online Blender meetup featuring, and I quote, sessions from Epic Games, Character Mill, Tangent Labs, Bone Studio, Red Cartel, Theory Studios, BlueStone Institute of Design, BlenderDiplom, Crowdrender, Los Angeles Blender Users Group (LA.blend), Vancouver Blender User Group, Blender 3D Meetup London, Sydney Blender Users Group (SBUG), Melbourne Blender Society, Seattle Blender Users Group (SEABUG), etc.
Visual Components announced they developed a free Blender add-on for their 3D manufacturing simulation software (Windows-only, it seems). It’s somewhat puzzling though that they shot an entire promo clip about the add-on without one single frame featuring the Blender interface.
The BoxCutter add-on has been updated (affiliate link, feel free to strip the id). You can now extract any surface using surface extraction.
Everyone’s having so much fun with EEVEE these days that Cycles seems to be taking the back seat, and it almost feels like LuxCore and Yafaray are falling off the face of the Earth. Well, not anymore. LuxCoreRender v2.3 and BlendLuxCore v2.3 were recently released. You can read the list of changes here, and I’ll just say this: it’s reassuring to see developers making use of OpenSubDiv and OpenVDB.
Jeremy Hu made another release candidate of Dust3D v1.0. You can fetch it at GitHub.
"New feature preview of #Dust3D: IBL shader is been fixed. You will be able to view the PBR/metallic material which wasn't showing properly. Will be released soon. pic.twitter.com/AWNi2dHB7c" (Jeremy HU, @jeremyhu2016, March 18, 2020)
VFX
New release candidate of Natron is available with fixes and small improvements, among those: support for reading and writing HEIF/HEIC images, GMIC update, SRT subtitle format support, and more. Grab it on GitHub.
Video
Pitivi has joined the club of programs where actions can be searched.
For that, the program uses the same shortcut as GIMP and Olive: the slash (/) key.

MattKC continues rewriting Olive and adding back features you already know from the stable series: audio waveform display, markers, basic zoom in the viewer etc. No fancy new features, just basic stuff.

Music and audio

I don’t really follow the development of LilyPond much, but Francisco Vila was kind enough to drop by to tell me they recently made a major release, v2.20, with a ton of changes. If score engraving in a text editor is your thing, definitely check it out!

There’s a lot of boring under-the-hood development going on with Ardour lately, so I kind of hesitate to mention it. Nevertheless, it’s not just new features that we need, right? Well, Paul recently fixed MTC and LTC support, and Robin keeps expanding the API coverage for Lua scripts.

On the other hand, feature-wise, there’s always some good news from VCV Rack.

VCV (@vcvrack), March 21, 2020: Rack v2 and Rack for DAWs development back up to full speed! Bypassing a module allows a signal to pass through the module's inputs to the most logical outputs specified by the module programmer, such as VCV VCF's audio input to LP/HP outputs. pic.twitter.com/EH4qwr0dqL

Tutorials

Great new tutorial about sculpting hair in Blender on the YanSculpts channel:

Zakey Design, Inkscape tutorial on creating a badge logo:

Krita painting timelapse by Alartriss (you really want to subscribe to their YT channel, it’s full of awesome):

Artworks

Cheryl Chen posted new artwork, “Giyuu”, made with Blender:

Sci-Fi render by Rachel, made with Blender:

A very nice way to introduce yourself in a group of artists, by Dana Codermak, made with Krita:

Artwork for a graphic novel, by Everette Arkitect Brown, Krita:

Not a bad way to test a new version of Krita, by Andreas Raninger:

A postcard-style landscape, also made with Krita, by Matt French:
Week highlights: new releases of OpenShot, Kdenlive, and Cinelerra-GG, first beta of Krita 4.2.9, new module in darktable, major Blender news, SIFT patent expired to open the door for autopano-sift-c inclusion to Hugin for generating control points in source panorama images.

Graphics

GIMP and MyPaint teams are basically taking a rest after the stress that accompanies recent major updates.

The Krita team has just released the first beta of v4.2.9 with several new features and a ton of fixes. Ramon Miranda shot a video about some of the new stuff. By the way, Agata Cacko shared her thoughts about working full-time on Krita for the past year. It’s an interesting perspective well worth the read.

Photography

The next release of darktable is likely to feature a new module for a film inversion algorithm based on the Kodak Cineon densitometry model that simulates a print from a digitized film negative. Here is a video by Aurélien Pierre:

You can also help him by completing a user survey.

Another interesting bit of news is that the SIFT patent expired on March 8. This affects a few projects out there, including OpenCV and Hugin. In fact, the Hugin team encouraged a whole Google Summer of Code project in 2007 (done by Zoran Mesec) devoted to creating a patent-free feature matcher (called Matchpoint) to work around that. Hugin now defaults to using CPFind. The news means they can start shipping autopano-sift-c with Hugin by default.

In a Twitter conversation regarding the news, Bruno Postle (another Hugin/panotools contributor) mentioned that CPFind and SIFT are comparable quality-wise, but some SIFT implementations are faster than CPFind.

Animation

Maurycy Liebner is now making JavaScript a first-class citizen in enve. Expressions are now JavaScript-based, and you can also use JS with ShaderEffects. There’s some documentation available.

After the release of v1.4.0, OpenToonz developers are now incorporating fixes and small new features into the main development branch. E.g.
most recently, the program got a snapping at intersection option in the cutter tool.

The Synfig team added a few improvements to the Warp layer making it a lot faster. See their weekly report for details.

3D

Two topics dominated last week in the Blender community.

First, the team is considering a switch to a different release model. The first proposal is to do one long-term support (LTS) release every year and maintain it for the next two years. The second one is to eventually switch to a more traditional release numbering scheme e.g. 3.0, 3.1, 3.2, 4.0, 4.1 etc.

A post on that was published in late February, in fact, but it was considerably amplified by a live session with Pablo Velazquez and Dalai Felinto.

The other major bit of news is that Ton Roosendaal has to step down for the duration of a month due to health issues. Don’t worry, he’s going to be just fine, and both Blender Foundation and Blender Institute are in the good hands of Francesco Siddi until April when the recovery is over.

Meanwhile, the team did a two-day long workshop on further UI changes and polish, including work on asset management, node types representation (affected by particle nodes, in particular, excuse the pun), multi-item properties editing etc. See here for more info.

Oh, and the most recent live session on YouTube covers latest changes, including Grease Pencil refactoring:

VFX

When I last mentioned Natron in recaps, Frederic Devernay was back to working on the project, fixing bugs and making release candidates. Since then, he moved to Seattle for a new job, however he continues hacking on the project as much as he can. And Ole-André Rodlie, another former team member, rejoined the project to help him.

Changes vary between UI fixes, small new features, 3rd party component updates, and bug fixes. You can grab the latest release candidate to see for yourself. Oh, and how about JACK transport support?

Video

OpenShot 2.5.1 is out with various fixes and faster effects.
The release was basically triggered by a regression in v2.5.0 that broke support for UTF-8 encoded paths to files (think Cyrillic or Japanese folder names). Major new stuff will come later.

The Kdenlive team released a bugfix update, v19.12.3. Most of the changes in the development branch recently are bug fixes, with some small UX improvements like adding Paste effects to the clips context menu and saving the full effect stack as an effect.

New version of Cinelerra-GG is out with several improvements. Most interesting bits are a new plugin called ColorSpace to convert between BT601, BT709, and BT2020, and PulseAudio becoming an option in Preferences for the sound driver.

Since Olive is undergoing a full rewrite, quite a few recent changes happen to be clean reimplementations of features that already existed in the past:

- Ripple delete
- Basic image sequence importing
- Compression options for H264 exporting
- Ability to cancel exports
- Basic waveform display support for audio tracks

Other changes include keyboard controls for inserting/overwriting footage into the timeline and node view optimizations. I know, it all sounds a little boring, but this is a good kind of boring.

Tutorials

Nick Saporito, vintage text posters with Inkscape:

Ducky 3D, making a photorealistic concrete material with cracks in Blender 2.82:

FlyCat, a timelapse of creating stylized Logan in Blender:

Pallab Biswas, a timelapse of painting a night sky with Krita:

Another Krita timelapse, this time by AnaM Draws:

Art

Sometimes, I feel like this section is Weekly Krita Artwork Recap :) Which is, in fact, awesome. A mere 5 years ago, I would have to look hard for great new art made with this program. There are so many more users these days!

David Revoy, “Dawn”, painted with Krita:

Philipp Urlich, “On a journey”, painted with Krita:

By the way, Philipp is on Patreon now.

AlbertWeand, “Nest”, painted with Krita:

Sad_Tea, “The valley of waterfalls”, also painted with Krita:

Sven Ebert, “Sid the Squid”, illustrated with Inkscape:
Highlights: major releases by Blender, MyPaint, GIMP, and MusE teams, coding sprint by Krita devs, first release candidate of Dust3D v1.0, new release of Shotcut, major changes in Olive and enve.

Graphics

Let’s start with major news. MyPaint 2.0.0 is finally out after several years of work. It’s now maintained by a whole new team that stepped up when there was no maintainer in the project anymore. They didn’t come empty-handed either. Here are some of the new features:

- Linear compositing and spectral blending (the Pigment blending mode to emulate the use of traditional media)
- New brush settings and input dynamics (attack angle, barrel rotation from Wacom Art Pen etc.)
- 5 new symmetry painting modes
- New options in the floodfill tool, including offset, feathering, and gap detection

See here for release notes, and here is a nice video by Gamesfromscratch.

GIMP 2.10.18 was released with multiple UX improvements, faster Photoshop brushes loading, a new 3D Transform tool, and more. I already covered the release extensively in a video (full transcript available), no need to repeat everything. By the way, if you use the MyPaint Brush tool in GIMP, check out this brush pack by SenlinOS, made specifically for GIMP 2.10.

The Krita team had another coding sprint in Deventer, this time without Dmitry Kazakov who is, theoretically, on paternity leave. He still pushes lots of fixes and even some new features like the Ratio, Airbrush, and Rate options for the Color Smudge brush.

Ivan Yossi, Agata Cacko, and Boudewijn Rempt worked hard on the resources management branch, and L. E. Segovia pushed quite a few fixes all around in the master branch. Another new interesting feature, splitting a layer into local selection masks based on color, comes from Saurabh Kumar.

Just van Rossum announced FontGoggles, a new open source (Apache License 2.0) font viewer for macOS, with development sponsored by Google.
He says, 99% is written in Python, so it should be portable to other systems.

Meanwhile, development of FontForge might be stuck to a certain degree. Frederick Brennan, one of the most active contributors to the project, left Manila in a hurry after a warrant for his arrest was issued. This appears to be the result of his most recent feud with Jim Watkins, current owner of 8kun (formerly 8chan). Libel is a criminal offence in the Philippines, and while fleeing from arrest doesn’t seem like a good idea generally, Frederick is severely disabled, and the sentence for the offense could be up to 12 years. See the coverage by Vice for more info.

Animation

Morevna project released their build of OpenToonz 1.4.0 for Linux, Windows, and macOS. It comes with a few extras, namely, Krita-like drawing assistants and advanced color selector (both submitted to the upstream OpenToonz project). It also doesn’t come with DSLR support that the upstream project has.

Maurycy Liebner introduced a ton of changes to enve. Some of the changes are:

- libmypaint 1.5.0 support with support for paint modes (erase, lock alpha, colorize)
- support for moving and cropping the paint canvas
- support for creating paint objects from image objects
- importing of ORA (OpenRaster) and KRA (Krita) files, linking of SVG files
- memory handling improvements

enve (@enve2d), February 18, 2020: - Paint modes (erase, lock alpha, colorize). - Moving paint canvas. - Sculpt path nodes visibility switch. pic.twitter.com/mRviyBFov4

Jose Moreno created a new onion skin panel for Pencil 2D.

3D

Blender 2.82 was released.
Major new features are:

- New physically-based liquid/gas simulation system using Mantaflow and improvements in cloth physics
- New FLIP solver to create lifelike liquids
- UDIM, the popular tile-based UV mapping system, now fully integrated in Blender’s pipeline
- Pixar USD exporting
- Various Cycles improvements

See here for release notes, and here is a quick video review:

The Cloth sculpting brush has been merged to future Blender 2.83.

Dust3D 1.0rc1 is out with a new rig generator, remeshing using instant meshes by Wenzel Jakob, a cloth simulating system based on Samer Itani’s FastMassSpring implementation, and more.

SHADERed 1.3.0, a free shader editor (as you can easily guess), was released with a bunch of new features, here are some of them:

- Shader debugger
- Plugin API
- Multisample anti-aliasing support
- Optional supersampling when rendering to an image file

And here is an older video intro to the project:

CAD

One project I completely forgot to mention in the previous recap is Horizon, a free electronic design automation package. After two years in development, v1.0.0 was out in late January. I’m not really qualified to give any opinion on this, but it looks interesting enough. Horizon is designed as an end-to-end solution. Quick list of features supported so far:

- Complete design flow from schematic entry to gerber export
- Library (pool) management
- Netlist-aware schematic editor
- Unified editor for everything from symbol to board
- Rule-based DRC
- Hardware accelerated visualisation

The fun part about the project is that it doesn’t reinvent everything from scratch but rather reuses good code from other projects. E.g. it uses KiCad's interactive router, QCAD’s dxflib, and more projects.

There’s a Windows installer available, as well as a somewhat outdated flatpak build.

You can watch the talk by its developer, Lukas Kramer, at the latest FOSDEM.

Video

I know I probably bore you with changes in Olive, especially since version 0.2.0 is probably not going to be released any time soon.
Heck, some people take this as a sign that the project is not doing well. Strangely enough, Matt is, on the contrary, very much active. Here are some of the changes that landed in Olive over the past couple of weeks:

- Complete memory management rewrite
- Functional exporting
- Uniform scale functional again
- A lot of plumbing and bugfixing
- Automatic creation of a sequence from footage is now back
- Work towards OSL as a renderer

People have been giving Olive some stress testing with 2+ hours of real live footage lately, and after some bug fixes it’s behaving well. It’s going to take a while to complete the rewrite and make a release though.

Dan Dennedy released Shotcut v20.02.17. I mentioned most of the changes in past recaps already, but here is a quick summary:

- New Preview Scaling option for the monitor
- New Vectorscope analysis tool
- Pitch audio filter
- More exporting presets

Screencasting in GNOME on Wayland is getting better thanks to Georges Stavracas and recent changes in PipeWire (yes, that project is very much alive and kicking). It’s likely that the next release of GNOME will have little CPU carnage when recording your screen.

Music

Just as usual, a lot of improvements and internal rewrites in Ardour. One recent change that stands out is the initial websocket control surface support. Basically, you would be able to control the DAW via your web browser of choice (or tablet?). Exciting stuff!

I know I don’t follow projects like MusE, Rosegarden, and Qtractor as closely as I probably should, but here is some news: Robert Jonsson released MusE 3.1. The MIDI sequencer got its fair share of updates:

- MidNam support in LV2
- A ton of fixes for LV2, LADSPA, DSSI, VST support
- Various UI improvements including better support for HiDPI

Robert also mentions that two major new features, realtime time-stretch and latency compensation, turned out to be problematic, so more fixes need to be applied before he can recommend them. See here for release notes and downloads.
Tutorials

Chris Hildenbrand recently started revamping his old Inkscape tutorials for 2D game design and making new ones. This one is on bows and arrows.

Recent new GIMP tutorial by Davies Media Design:

New Siril tutorial on pre-processing astrophotography images:

New painting timelapse with Krita, by Den Jay. No photo reference in sight either:

And just because vector tools in Krita don’t get a lot of attention on YT, here is another speedpainting, this time by Proton Kid:

Steve Lund, the master of 1-minute Blender tutorials, strikes again.

Artworks

Taoro, by Philipp Urlich, painted with Krita:

Some new concept art by Everette Arkitect Brown, made with Krita:

“Fro Bro”, by Sven Ebert, made with Inkscape:

Farmhouse kitchen render made with Blender, by clayton.95:

Steampunk airship by MillionthVector:
The new version of GIMP is full of usability fixes, new features, overall improvements, and bug fixes. Let’s dive right in!

But first, the usual disclaimer for cases whenever I make a dedicated post about GIMP. Apart from writing for Libre Graphics World, I’m also a non-coding contributor to GIMP, so if you suspect that I could be biased about this program, that’s because I am.

For this particular release, I shot a video that covers most of the changes in this version. But since not everyone likes videos, below is the full transcript.

Now, as you probably know, starting with the 2.10 series, new features are allowed into stable releases. Which is why we’ve seen major improvements coming to end-users on a regular basis just several months apart. Which is a lot better than having to wait for 4 to 6 years.

The new version is especially exciting because it fixes quite a few gripes people had with GIMP. So let’s start with those.

Toolbox revamp

The first thing you are going to notice immediately is that the toolbox got smaller. That is because tools are now logically grouped by default, just like in some other image editors that you might have used in the past. To open the list of tools, left-click and hold or just right-click. Another, not quite discoverable way to change a tool in a group is to hover over the group and then scroll the mouse wheel.

Instead of cramming all tools into groups like Selection and Transformation, Ell took a better approach and made several groups for both based on the kind of interaction they rely on. E.g. Free Select, Scissors, and Foreground Select are grouped together, as they have interaction similarities. So are Warp Transform and Cage Transform, as they both are what I would call local transformation tools.

The overall effect is that the toolbox looks cleaner, especially when you don’t dock any dialogs below.

I know from reading the feedback that people generally cheered on this change, but not everyone agreed.
So while these are the new defaults, you can go ahead and customize the hell out of this feature.

For that, go to Edit > Preferences > Interface > Toolbox.

Now, you can disable this feature entirely by ticking off the checkbox that says “Use tools groups”. Or you can make your own groups and reorganize tools the way you like it by just picking and then dragging tools around. The changes will show up in the toolbox immediately.

New docking behavior

By the way, if you customized GIMP to only show the toolbox at the left side, this message here probably used to annoy you a lot. Well, now it’s gone. And since there had to be some replacement for it to make docking obvious, GIMP will now highlight dockable areas when you start dragging a dockable dialog.

New higher-contrast icon theme

And then the next change quite a few people might like is a higher-contrast variation of the symbolic icon theme. There have been numerous reports that the new default symbolic icon theme doesn’t have enough contrast. So this is a temporary workaround for people who want a bit more punch in the toolbox.

The customization interface is right where it was: on the Icon Theme page of the Preferences dialog. The team is looking forward to improving this further in the GTK3-based version of GIMP where it could be possible to use CSS to customize icon themes, similarly to how it works in the upcoming Inkscape 1.0 version.

New compact style for sliders

The next change that is very much visible is the new compact style for the sliders.

Now, I know I’m a GIMP contributor and I’m supposed to be fighting for all its quirks to the death. But seriously, the old slider sucked so much even though it was actually added to improve the user experience. Which it did, but only up to a point.

The vertical separation into small and default increment was not very discoverable for new users and not all that convenient to experienced users. The numeric input field simply got in the way.
And the height of the widget was so large that for tools with a lot of options like the Paintbrush, people had to scroll up and down to move around settings all the time.

OK, so the new compact style for that widget changes a lot. It’s small, so you get to see more, it allows changing values with default, smaller, and larger increments, and the numeric input is just there for when you need it. This is how the interaction works now.

You can just single-click with the left mouse button anywhere to set a new value. That’s not new, it’s how it already worked before.

You can left-click and drag to change a value with a default increment.

Then you can press Shift and left-click and drag (or just right-click and drag) to change a value with a smaller step. This is something people commonly do when they use a small brush and need to change the size just a little bit.

You can press Ctrl and left-click and drag to change a value with a larger step. Like, going from a small brush to a much larger brush very fast.

Instead of left-clicking, you can use all these new modifiers with the mouse wheel scroll.

Now, about the numeric input. Whenever you click anywhere on the slider, the numeric input mode gets activated. But it doesn’t really get in the way. The blinking text cursor just sits there waiting for your input. Simply type the value and press Enter or Tab to confirm. You can deactivate the numeric input by pressing Escape.

There are two more things worth telling here. First, you can middle-click on the slider to just enter the numeric input mode without changing the value. And if you right-click, you both enable the numeric input mode AND select the value which is handy if you want to replace the value completely rather than adjust it.

Again, you can disable all this as well. I really have no idea why you would want the old gigantic slider back, but you can do it in Preferences.
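The modifier scheme described above boils down to a mapping from modifier keys to step sizes. Here is a minimal Python sketch of that idea. The function names, the step factors, and the rule that Shift wins when both modifiers are held are all made up for illustration; none of this is taken from GIMP's source code.

```python
def slider_step(base_step, shift=False, ctrl=False,
                small_factor=0.1, large_factor=10.0):
    """Return the increment for one drag or scroll unit.

    The factors are illustrative assumptions, not GIMP's real values.
    """
    if shift:            # Shift (or right-drag): finer adjustment
        return base_step * small_factor
    if ctrl:             # Ctrl: coarser adjustment
        return base_step * large_factor
    return base_step     # no modifier: default increment


def apply_drag(value, units, lower, upper, **modifiers):
    """Move `value` by `units` steps and clamp it to the slider range."""
    new_value = value + units * slider_step(1.0, **modifiers)
    return max(lower, min(upper, new_value))


# A hypothetical brush size slider from 1 to 1000 px, currently at 20 px.
print(apply_drag(20.0, 5, 1.0, 1000.0))              # default steps
print(apply_drag(20.0, 5, 1.0, 1000.0, shift=True))  # smaller steps
print(apply_drag(20.0, 5, 1.0, 1000.0, ctrl=True))   # larger steps
```

The same function would serve both dragging and mouse wheel scrolling, since in either case the input is just a signed number of units plus the modifier state.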
Improvements for global transformation tools preview on the canvas

The next series of changes makes global transformation tools way more usable.

Composited preview

It starts with the new Composited Preview option that essentially fixes a hackaround introduced many years ago. Here’s the problem.

When you have multiple layers, each one with its own blending mode and opacity, transforming one of them means it pops up right above every other layer. So in a complex layers composition you can’t align this layer against other layers without much trial and error.

The new Composited Preview option removes this hackaround in favor of rendering the preview of the transformed layer exactly where it is in the layers stack, exactly with the blending mode of choice.

Synchronous preview

It comes with a suboption called Synchronous preview that is more experimental. The idea is to render the preview as soon as you change the transform. So instead of waiting for the mouse to stop moving, it renders the result immediately. If GIMP can render everything fast enough, this means a much smoother and more instant feedback.

But this option also blocks everything until the preview is done rendering. This means, GIMP can become much less responsive, usually when the layer is very large. That’s why this is disabled by default.

And then there’s another new option to enable displaying transformation of all linked layers rather than the currently selected one. Again, something people have been expecting to happen for many years. I believe the relevant bug report was filed 16 years ago.

GIMP (@GIMP_Official), January 18, 2020: New in development: you can now optionally preview the transformation of all linked layers. This partially fixes a bug reported by @jimmac 16 years ago :) Once mipmaps hit GIMP, all these new options will become the expected default behavior. To be released in version 2.10.16. pic.twitter.com/cAlZX9edjJ

Why they are all optional

Now, all that might seem like too many optional features that should be on by default anyway. The reason why these are all options is that these things tend to become sluggish as image size gets larger.

All of this will change once the mipmap rendering lands in GIMP.

If you are not familiar with the concept, the general idea is to separate what you see on the screen from the actual data. On the screen, you would see a smaller version of the image that you would transform or apply filters to, and in the background the program would silently process the full version of the image. This is quite similar to proxies in video editors.

This way, working on large images will become a lot faster, and there will be no need to keep this sensible new behavior disabled by default.

Clipping preview on the canvas

Another change here that is, in fact, not optional is the clipping preview on the canvas. Supposing you rotate a layer. Once the data goes outside the layer boundary, you have several options how GIMP should clip it.

‘Adjust’ is the default option and it means no clipping. It does look clipped by default. But if you enable the ‘Show all’ option, there it is.

‘Clip’ means just keep whatever is inside the layer boundary and drop the rest.

‘Crop to result’ will try to find the largest possible rectangle of original data inside this rotated bounding box.

And ‘Crop to ratio’ will do the same while maintaining the original ratio between width and height.

So what’s different about it is that previously GIMP displayed the cropping only after you applied the transformation. Which means more trial and error. Now you get to see everything while doing the transformation.

New 3D Transform tool

Another major change is a whole new global transformation tool called 3D Transform.
It’s more of a 2.5D transform really but that’s beside the point.

This is going to be helpful for cases when you need to rotate an image as if it was stretching towards a vanishing point. Unless you are trying to align a screenshot over a blank screen in a MacBook Pro mockup and thus have the real perspective reference, using the Perspective tool requires quite a lot of patience.

With the new tool, you don’t have to try hard to make the perspective stretch believable. You just switch to the tool, change the vanishing point position, if you need to, and then rotate. Or pan. Or both. That’s all.

At your preference, you can either drag sides of a layer on the canvas to rotate like in a 3D modeling software or use numeric input in the on-canvas dialog.

GIMP (@GIMP_Official), January 11, 2020: The new 3D Transform tool got a 'Unified interaction' option where you can reposition the vanishing point, move, and rotate depending on where you click and drag. No need to switch between modes. The new tool will be part of version 2.10.16. pic.twitter.com/cG2Nx4Gb0q

You can use modifiers to constrain rotation to just one axis of your choice. And once you rotated a layer, you can enable e.g. the Z axis constraint and the ‘Local frame’ option to rotate the layer on the imaginary plane that this layer is now part of.

Now, in all honesty, I don’t think that Ell is trying to turn GIMP into a full-fledged 3D modeling software or texture painting software. We already have great free tools like Blender and ArmorPaint for that.

However, there are quite a few valid design uses for transforming objects in 3D. Basically, any design that needs believable perspective would greatly benefit from this. Especially if it will become possible to rotate and scale in 3D in one go.

Smoother brush outline motion on the canvas

The next chunk of updates is for people who use GIMP for drawing and painting. Starting with this version, you should see a much smoother brush outline motion.
There are two major reasons for that.

One is that several releases ago, Ell, who is one of the most active GIMP developers, separated translating the brush strokes to actual dabs, and updating the display. These are, in fact, two different things.

See, when you paint in software, you drag a mouse or a stylus on the canvas, and that sends some information to the brush engine: how fast you draw, in what direction, how much pressure you apply while drawing, whether you rotate or tilt the stylus and so on.

The brush engine takes all that into consideration, maps it to various brush settings like size and spacing, and then turns all this data into actual pixels written to the layer that you paint on. Then it updates the display which means showing both the dabs you made and the brush outline.

GIMP (@GIMP_Official), February 4, 2020: New in development: higher canvas update rate for painting tools makes brush outline motion much smoother :) Patch by Ell. To be available in 2.10.16 soon. pic.twitter.com/Qwxm72RbH6

Until fairly recently, this approach meant that the brush engine couldn’t go back to translating your brush strokes into pixels until display update was complete. From users' standpoint, it looked like the software couldn’t keep up with your drawing. And that was really frustrating if not infuriating.

So Ell fixed that several releases ago, and now GIMP translates brush strokes into pixels and updates the display separately from each other. The rate at which brush strokes were rendered to the display was already high enough, so the last missing bit was a higher refresh rate for brush outlines. It was literally a one line fix in the source code changing the rate from 20 fps to 120 fps and it made a lot of difference visually.

And then there’s a second reason why brush outline motions are now smoother. Before, GIMP used to snap the outline to the center of the last brush dab. It was mostly visible when you used large spacing values.
Ell made this snapping an option that is disabled by default. So now painting with large spacing between each dab shouldn’t feel nearly as choppy as before.

Faster loading of Photoshop brushes (ABR)

For people who use a lot of Photoshop brushes, GIMP will now start a lot faster. Before, GIMP used a very inefficient way to load these brushes. It was fixed just a week before this release.

To give you an idea, I did a quick test and downloaded a bunch of ABR files, some of them provided by a fellow GIMP user Sevenix.

I ended up with 214MB of those brushes. Then I built two versions of GIMP, one before the fix and one after the fix. Then I used the performance profiler that Ell created a while ago.

This feature writes a log of how much time it takes to do certain operations in the program so that developers could investigate various inefficiencies. In this case, I was interested in how much time it takes to load all brushes at the startup.

So here is the performance log viewer showing the first log where there is no fix. As you can see, that’s a little over 28 seconds. Not exactly unbearable, but a really, really long time.

Now, here is the data for the version of GIMP where this is fixed. We are looking at a little less than two seconds of time spent parsing and loading ABR brushes. Again, that’s 33 files and 214MB of data. That’s actually not too bad!

Of course, you should keep in mind that absolute numbers will vary between different computers depending on the specs. For example, this is a laptop with an SSD. If you have an HDD, these numbers will be quite different. But the magnitude will be about the same.

It’s highly likely that Ell will eventually apply the same lazy loading magic to resources such as brushes and patterns that he used for fonts almost two years ago. So GIMP will skip these resources at the startup time, show you the interface, and then load these files. At this point, I can’t say which version of GIMP will have that though.
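To illustrate the general idea of lazy loading described above (skip the resources at startup, show the interface, then load the files), here is a minimal Python sketch. It is not how GIMP does it: GIMP is written in C, and the class name, method names, and the stand-in parser below are all hypothetical.

```python
import threading


class LazyResources:
    """Loads resources on a worker thread so startup is not blocked."""

    def __init__(self, paths, load_one):
        self._paths = list(paths)
        self._load_one = load_one      # hypothetical per-file parser
        self._loaded = {}
        self._done = threading.Event()
        self._thread = threading.Thread(target=self._load_all, daemon=True)

    def start(self):
        # Returns immediately; loading continues in the background.
        self._thread.start()

    def _load_all(self):
        try:
            for path in self._paths:
                self._loaded[path] = self._load_one(path)
        finally:
            # Signal completion even if a parser raised an exception.
            self._done.set()

    def wait(self, timeout=None):
        # Block until loading has finished (or the timeout expires).
        return self._done.wait(timeout)

    def get(self, path):
        # Anyone who needs a resource early simply waits for the loader.
        self._done.wait()
        return self._loaded[path]


def fake_parse(path):
    # Stand-in for an expensive brush parser; a real one would read the file.
    return path.upper()


store = LazyResources(["a.abr", "b.abr"], fake_parse)
store.start()   # the interface could be shown right after this call
store.wait()
print(store.get("a.abr"))
```

The trade-off of this approach is that the first tool that needs a brush may have to wait for the loader, but the window itself shows up without that delay.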
Further UX improvements

There are also two small convenience changes.

New pivot selector widget

First is the new pivot selector widget that you can see on the on-canvas dialog for the Rotation tool.

Supposing you want your layer to be rotated around a point that isn’t the center of that layer. Before, if you wanted that pivot point to be, say, the middle of the left side here, you had two options.

One was to try really hard to drag the pivot exactly to the middle of the side, which usually means squinting and swearing. The other option was to calculate the position of a guide so that it would go exactly through the middle of the layer, and then drag the pivot to snap to the intersection of the guide and the layer’s boundary.

Now you have this little widget where you can click to set the pivot’s position to one of 9 common presets: the center, one of 4 corners, or one of 4 midpoints.

Consolidated UI for merging layers and anchoring floating selections

Another convenience change was introduced by a new contributor who calls themselves woob. The Layers dock now displays the anchor button only when there’s a floating selection in your project. When there isn’t one, it will display a button for merging layers instead.

This button can also be used with several modifier keys.

If you want to merge all layers inside a layer group, select the layer group, press Shift and then click the button.

If you want all visible layers to be merged, press Ctrl and click the button. You will see a dialog with some options.

And if you want to merge visible layers with the settings that you last used in that dialog, press Ctrl and Alt keys together, then click the button.

New release availability checker

The last major change in this version, introduced by Jehan Pagès, is the automatic check for new version availability.
Now every time you start GIMP, it will check whether a new version of GIMP has been released.

All it does is read a file on GIMP’s server that says what version is the most recent, and then compare that release version to the one currently installed on your computer. If there is an update, it will tell you so.

This is handy because there are quite a few users out there who don’t follow the development and miss out on updates bringing bug fixes, usability improvements, and new features.

The update checker also looks for new revisions of existing versions. Here is what it means.

GIMP relies on quite a few 3rd party components for things like reading this or that file format. Like any other software, these components are prone to bugs and security issues. So when there is a fix for an issue that affects GIMP users, developers of GIMP typically repackage installers to include the updated 3rd party component. Since no code in GIMP is changed, this does not qualify as a new version.

So if you are a Windows or macOS user, GIMP will now tell you if there’s an updated installer available.

What’s coming in 2020

Finally, let’s talk about what’s coming next for GIMP.

While the team keeps making new 2.10 releases at a steady pace, three or four times a year, their real focus is completing the GTK3 port of GIMP and the refactoring. Here is why this is important.

Everybody, including developers, wants GIMP to gain non-destructive editing as soon as possible. This is scheduled for version 3.2. It has been this way for quite a few years now; it wasn’t postponed, nothing has really changed.

But right now, GIMP relies on the old version of the user interface toolkit. This version has bugs, it isn’t really maintained, and developers of GTK don’t even review patches for that obsolete version of the toolkit. Which is also understandable.

Besides, HiDPI displays are now pretty much the norm, and unlike GTK3, the second version of the toolkit doesn’t handle them at all.
GIMP developers had to invent clever ways to trick the toolkit into working on HiDPI displays, and it all falls apart once you work on two displays with mismatching resolutions. And that’s just one of several scenarios where things go wrong.

So GIMP really needs to move to the newer version of the toolkit to stay maintainable.

The other part of the effort is internal refactoring. What it means is that when developers look at the code they wrote years ago, they tend to find mistakes.

Now, some of those mistakes are easy to fix, but some are design decisions that seemed like a good idea at the time, but eventually turned out to be wrong.

At some point, there are so many bad design decisions that you cannot build new features that will work reliably and perform well.

GIMP is 24 years old this year, and it’s also around a million lines of source code. So, as you can imagine, refactoring is not an easy job to do.

However, it seems that soon enough the first development release will be made. This will lead to several more development releases, all essentially making the program more stable.

It usually takes two or three development releases for GIMP programmers to get an idea of how much more work has to be done until they can safely cut a release and call it stable.

So for 2020, we are most likely looking at two or three more 2.10 releases with new features and bug fixes, and maybe one or two development releases in the 2.99 series. Again, this is just my gut feeling. I could be entirely wrong about that, we shall see.

You can download GIMP at the usual place over at gimp.org.
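For the curious, the logic of the release availability check described earlier in this article can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `latest` dictionary with `version` and `revision` keys is a made-up stand-in for whatever file GIMP actually reads from its server, and the function names are hypothetical. The sketch does no networking, it just shows the comparison.

```python
def parse_version(text):
    # "2.10.16" -> (2, 10, 16), so that 2.10.16 sorts after 2.10.9
    return tuple(int(part) for part in text.split("."))


def check_for_update(installed_version, installed_revision, latest):
    """Compare the installed release with the advertised one.

    `latest` is a dict with hypothetical keys "version" and "revision";
    the real file format on GIMP's server is not shown here.
    """
    new = parse_version(latest["version"])
    current = parse_version(installed_version)
    if new > current:
        return "new version"
    # Same version, but the installer was repackaged, e.g. with a fixed
    # 3rd party component.
    if new == current and latest.get("revision", 0) > installed_revision:
        return "new installer revision"
    return None
```

Comparing tuples of integers rather than raw strings matters here: as plain text, "2.10.9" would sort after "2.10.16".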
Highlights: new releases of RawTherapee, Cinelerra, and OpenShot are available, the Krita team is fixing unit tests, there are even more UX fixes in GIMP, there has been major Godot news, the enve developer is unstoppable!

Graphics

Kuntal Majumder posted another great overview of what’s going on with Krita development. You should totally visit his blog for this kind of information. In a nutshell:

- The team has been busy fixing unit tests so that bugs would be discovered early on.
- The resource management rewrite is still a work in progress.
- Switching to Python 3.8 made the packaging of Krita for Windows quite a bit of a nuisance, this might delay the v4.3.0 release.

Meanwhile, the GIMP team pushed the last major changes before the v2.10.16 release:

- The much hated “You can drop dockable dialogs here” message under the toolbox is gone for good. Instead, GIMP now highlights dockable areas as you start dragging a dockable dialog.
- Refresh rate for brush outline preview on the canvas has been bumped from 20 FPS to 120 FPS. Additionally, the snapping of the outline preview to dabs is now optional and off by default. Both changes result in smooth motion all around.
- The update check has been ported from the git master branch. From now on, every time GIMP launches, it checks for availability of a newer version and tells you if there is one, unless you explicitly forbid it to do so. This helps users to discover new releases. For Windows and macOS users, it also checks for an updated installer of the same version (happens when a 3rd party component gets a fix for a bad bug).

New in development: the dreaded "You can drop dockable dialogs here" message in the toolbox is gone :) Docking areas are now highlighted when you start dragging a dockable dialog. Patch by Ell. To be released in 2.10.16 soon.
pic.twitter.com/kKbIaROeC3 - GIMP (@GIMP_Official), February 2, 2020

Photography

The RawTherapee team released version 5.8 with some notable improvements:

- New Capture Sharpening tool to recover detail lost to lens blur. For more info on this tool, I highly recommend reading this thread on Pixls, especially Ingo’s explanation of how he designed capture sharpening.
- Initial support for Canon CR3 files (no metadata yet, this would be courtesy of Exiv2 developers).
- Improvements for various camera models, better memory management, speedups and optimizations all around.

Siril is a fun new project to watch. Several processing filters got live preview support and UI updates. Additionally, various performance improvements were contributed by Ingo Weyrich and Rafael Barbera.

Animation

The Morevna project team somewhat belatedly built their own version of OpenToonz 1.3.1 and is now working on the newly released v1.4.0. The builds come with extras not yet available in the upstream project: Krita-like assistants, advanced color selector, and script execution via command line.

Quite a few improvements in enve:

- Multiline math expressions support
- Panning with right mouse button for laptop users
- Small audio-related improvements
- Beginnings of the i18n support
- Various fixes

The math expressions feature is getting really interesting. Check this out:

Maurycy is also teasing people with dynamic path operations:

Gamedev

One important bit of news that I missed in the previous recap is the release of Godot 3.2. Some of the highlights of the release:

- Mono/C#: Android and WebAssembly support
- AR/VR: Oculus Quest and ARKit support
- Visual Shaders overhaul
- Fully functional glTF 2.0 import pipeline and initial support for the FBX format
- WebRTC support thanks to a $50K grant from Mozilla
- Android build and plugin systems
- Various coding tools
- 2D: Pseudo 3D, Texture atlas, AStar2D
- Audio generators and spectrum analyzer

See here for vastly more details.
The next major release will be Godot 4.0 featuring a Vulkan-based rendering backend.

On top of that, the Godot team got an Epic MegaGrant of $250K to improve graphics rendering as well as GDScript, the built-in game development language.

Video

The Cinelerra-GG team made another monthly release. Some of the changes are:

- File by Reference: this is pretty much media linking. Update the file outside Cinelerra, and it will be updated in all your projects the next time you open and render them.
- Windows 10 port is now available, with some limitations.
- The user manual has been converted from LibreOffice to LaTeX, new PDF file now available here.

OpenShot 2.5.0 is out (grab it here). Highlights:

- Hardware decoding and encoding support (nvenc, VA-API)
- Completely rewritten keyframing system for better performance
- EDL and XML importing and exporting for Premiere and Final Cut support
- Brand new approach to generating thumbnails (via a local HTTP server)
- Blender 2.8x support
- Improved backup and recovery system
- A lot of SVG support fixes

Judging by the changelog, there are indeed at least two active developers in the project now: Frank Dana joined Jonathan Thomas. This is more or less what Jonathan was talking about when I interviewed him for the video editors sustainability article a little over a year ago.

Music

There have been a lot of technical low-level commits to Ardour in the past few weeks. Last week, I saw some extremely cautious talk on IRC about a v6.0 prerelease happening soon. Not that I’m into speculation, but quite a few users are holding their breath so much that windows warp inwards.

Artworks

Cute Frodo by Nathan Mon o , painted with Krita:

Even cuter orc warlord by Marco Iacopetti, made with Blender:

BB-8 droid by Thiago Ushkowitz, made with Inkscape:
Highlights: new releases of ART, Synfig, and OpenToonz, massive improvements in GIMP, Siril, Blender, enve, and Olive, new beta of Shotcut, a great deal of bugfixing in Krita for upcoming v4.3.0.

Graphics

January has been all sorts of fun for many projects. While Krita developers were focusing on the challenging resources management rewrite and bugfixing, the GIMP team had several busy weeks fixing a few old bug reports. And when I say ‘old’, I mean ‘16 years old’ kind of ‘old’. I’d like to return to this topic in great detail when the next release is out, but for now, here is a quick list of changes.

- Sliders now use a compact style by default. Benefits: streamlined interaction, better use of screen real estate, numeric input field doesn’t get in the way anymore.
- Tools are now grouped in the toolbox by default. This cleans up quite a bit of space and makes the default layout rather irrelevant in my opinion.
- The much hated “You can drop dockable dialogs here” message in the toolbox is now gone for good. Instead, GIMP highlights areas for docking once you start dragging a dock by its title bar.
- New 3D Transform tool: think effortlessly making an image stretch into perspective towards a vanishing point in a way that looks realistic.
- Composite preview on the canvas: the layer you are transforming will not pop above the rest of the layer stack.
- Clipping preview on the canvas: you’ll see exactly how a clipped layer will look after you are done rotating or perspective-correcting.
- Preview of transformation for linked layers: all linked layers will update on the canvas.
- New pivot selector widget: easily set position of the pivot to one of 9 presets.
- Easier layers merging: only see the anchor button when a floating selection is available, otherwise see options for merging multiple layers.

All of this will be available in version 2.10.16.

Photography

Alberto Griggio released ART 1.0, his friendly fork of RawTherapee.
It comes with UI changes, some features ported over from darktable, some new tools for local editing etc. You can find out more from the wiki page.

I don’t usually cover scientific software, but Siril is great for astrophotographers just as well, and it’s been seeing some nice improvements lately. There’s 32-bit float support in progress, the team is switching to librtprocess to make use of RawTherapee’s pack of algorithms, and they recently started doing a UI revamp where image preview is finally right in the main window. They also got some help to get flatpak and macOS builds up and running, with contributions from Alex Samorukov (GIMP) and the Pixls community. Ingo Weyrich from RawTherapee is helping them to improve Siril’s performance too.

Animation

The Synfig team continues hacking on the project. Most recently, they merged the sound dock implemented by Rodolfo Ribeiro Gomes. This is now available in the master branch and will be part of the upcoming v1.3.12 release leading up towards 1.4.0. In fact, the release has already been cut, the official announcement will be available shortly. Meanwhile, you can read more about the changes that will be part of v1.4.0 (that includes everything you will see in 1.3.12).

There has been a ton of improvements in enve since my coverage in December:

- Undo/redo support
- Text effects
- Vertical and horizontal text alignment
- Paths sculpting
- Dropping files into Files dock now possible
- Frame remapping for Scene links with floating point precision
- SrcATop and DstATop blending/compositing
- Math expressions to change values

Dwango released OpenToonz v1.4. Quite a few new features as compared to v1.3, here are just some of them:

- Guided drawing / autoinbetweening
- Camera column in XSheet
- Context-aware toolbar
- Geometry tool driven motion paths
- Next/Previous keyframe shortcuts
- Fill tool for raster level
- Updated and reorganized menus
- Canon DSLR camera capture support (Windows only)

As usual, only Windows and macOS builds are available.
Morevna Project will probably supply an AppImage soon.

3D

Dalai Felinto posted a review of big projects for Blender in 2020. It’s absolutely a must-read if you are interested in what’s coming for the project.

Richard Antalik, the maintainer of the video sequencer, is now working full-time on Blender, and five people joined the studio team to work on assets, the Cosmos Laundromat series, and new training material.

Pablo Dobarro is killing it with the new Cloth brush:

The Cloth Brush has an alternative plane falloff mode which drags cloth from an entire cross-section of the model at the same time. This results in a more intuitive way of manipulating cloth compared to the standard radial falloff sculpt brushes have. #b3d pic.twitter.com/VUjuNzlQSJ - Pablo Dobarro (@pablodp606), February 1, 2020

And there are all kinds of improvements heading towards Grease Pencil:

We are working in a lot of improvements for Grease Pencil. Thanks very much to @hypersomniac_ (EEVEE developer) for his magic!!! pic.twitter.com/Uu1nwDQg6p - Daniel Martínez Lara (@_pepeland_), January 30, 2020

Intel announced the release of OSPRay v2.0.0. The new version of the open source (Apache License 2.0) raytracing engine comes with the new Intel Open Volume Kernel Library (enhanced volume sampling and rendering features and performance), as well as direct support for Intel’s Open Image Denoise as an optional module. See here for more details.

CAD

Yorik van Havre posted an update about his work on FreeCAD in December ‘19 and January ‘20. Do check it out. Some of the highlights are: Blender add-on for exporting to FreeCAD native project files, multi-line editor for text blocks in the Draft workbench, hatching inside 3D view etc.

Lately, there has been some activity in the LibreCAD 3 git repository courtesy of Akhil Nair and Florian Rom o. This is quite encouraging given that the v2 branch is rather silent. The changes are mostly small refactoring, UI work, drawing capabilities and continuous integration improvements.
Video

Brian Matherly contributed a vectorscope to Shotcut, and Dan Dennedy added a new Settings > Preview Scaling option. The beta of v20.02 is now out, you can grab and test it.

Volker Kohaupt completed the rewrite of vokoscreen, his screencasting application. The new version called vokoscreenNG now relies on GStreamer rather than FFmpeg directly (unfortunately, this means no VP9 support, at least on my setup).

MattKC came out of the blue in mid-January and delivered well over 300 commits to Olive accumulated in his private repository since early December. He didn’t stop at that though. Basically, it’s a huge chunk of the complete rewrite that started last May. I think the shortest summary of the changes would be “no stone left unturned”. Here are just some of the major changes:

- New node structure, with multiple node iterations support.
- Olive can now read fragment and vertex shaders from nodes, various effects have been moved to external shaders.
- External xml/shader loading for transitions (which are nodes) now possible, so that contributors could write external code for transitions.
- New caching system with cache visualization and a hybrid indexing system.
- New Export dialog has a preview area and OCIO settings.
- OpenColorIO configuration is a project-level setting, the viewer has complete color management support, with configurable display, view, and look parameters.
- OpenImageIO is seeping through various bits of Olive.
- Keyframing and curve editor for video and audio clips have been rewritten from scratch.
- Support for multilayered keyframes is now available, which means you can keyframe more than one value per input (e.g. a vec2, vec3, etc.).
- Rewrite of several timeline tools including transitions, as well as video/audio speed stretching.
- Automated builds for Windows, Linux, macOS.

Now, here comes a warning in big fat red letters: Olive is nowhere close to being usable when used from git master. It crashes a lot and even plays audio clips after audio clips are over.
There are weird things going on everywhere. None of that worries me, because somewhere deep down, Olive is shaping up to become extraordinary now that it’s based on solid foundations like OCIO. My only concern here is that, similarly to other projects that I cover, Olive is being written pretty much by just one developer. This is unsafe as hell, yet there’s little to do about that until the end of the rewrite. I can think of just one other person who understood the old codebase well enough to consistently hack on it, and it’s jonno85uk, who preferred to fork Olive into a project named Chestnut.

One last thing to mention here is that I need to retract my earlier statement about Olive from the master branch not being able to drop clips onto the timeline. What actually happened is that Olive now requires manually creating a sequence first. Then you can drop media files. Previously, it automatically created a sequence with settings from the first video file you fed it with. This might be changed back, I just don’t know at this point.

Music

Stefan Westerfeld released liquidsfz 0.2.0, a free/libre SFZ sampler. Among user-visible changes, the new version features an LV2 plugin, support for <control>/<global>/<master> sections, and support for opcodes such as key switches and crossfading for layers. It also ships with more amp-related opcodes and allows changing more parameters using CC.

And if SFZ is your thing, check out sfizz, another SFZ library for supporting this file format. There have been no public releases yet though.

Ninjas2, a slicer available as an LV2 plugin, is out with a redesigned interface. All controls are now divided into three groups (Global, Slicing and Slice), and the keyboard allows clicking on a key to play the dedicated slice. Builds for Linux, Windows, and macOS are available for downloading.

Tutorials

Aurélien Pierre wrote an article on darktable’s gradual switch towards linear RGB and away from CIE LAB for image processing.
He also outlined which modules are and are not recommended to use right now and which modules should be used instead. This is an absolute must-read for anyone using darktable.

Nick Saporito shot a new tutorial on using the Pattern Along Path live path effect in Inkscape:

There’s a new Inkscape tutorial from PixieArt as well, this time on drawing a Celtic knot:

New awesome tutorial by Aidy Burrows on retopology tools in Blender 2.8x.

Timelapse by TopChannel1on1 on modeling ancient Greek architecture:

Art

New artwork by Philipp Urlich, made with Krita:

New Ubuntu animal, Focal Fossa, by Sylvia Ritter, also made with Krita:
Technical information on the website and how we made it.

“We” are paperdigits (Mica), darix, and Pat David (the strange people behind PIXLS.US).

Site

This website is generated as a static website using the Hugo generator. The theme is based on a bare, minimal Bootstrap theme created by Mica for projects at PIXLS.US with a little help from Pat. The webserver is nginx on openSUSE Tumbleweed.

Type

The font used is Libre Baskerville by Pablo Impallari and Rodrigo Fuenzalida and is licensed under the SIL Open Font License v1.1. Libre Baskerville is a webfont family optimized for body text. It’s based on 1941 ATF Baskerville Specimens but it has a taller x-height, wider counters and less contrast that allow it to work on small sizes in any screen.
Highlights: darktable 3.0.0 release and beta release of MyPaint 2.0, major updates of Kdenlive and Flowblade, OpenToonz is approaching v1.4, Blender entered bug tracker curfew, quite a lot of excitement in the BIM department.

Graphics

The past few pre-holiday weeks have been slow for both GIMP and Krita projects (part of the GIMP team is meeting at the 36th Chaos Communication Congress in Leipzig as you read this though).

Not so much for the MyPaint team that announced the first MyPaint 2.0 beta release. Major changes since the previous alpha release:

- Compatibility preferences now available to prevent your older projects from looking incorrect
- Improved categorization of brush settings (e.g. the Experimental group is gone)
- Faster painting with new symmetry modes

One fun conversation that’s been going on for a while now is the relicensing of FontForge from GPLv3+ back to BSD, the way the original creator of the project, George Williams, chose it to be. To sum it up, the sentiment seems to be that the decision to relicense to GPLv3+ was done poorly, and people who participated in this have mostly not been around since.

Apparently the effort towards relicensing to BSD has been under way for a while now. It does meet some objections though, and some people on Twitter fear that once it’s completed, Adam Twardoch would grab all he can into Fontlab (as if he didn’t have a chance between 2000 and 2012).

A far more interesting topic that gets raised in the thread is about the governance of the project. How much power does a maintainer really have when he/she is not the original creator of the project too?

Photography

The traditional Xmas-ish update of darktable is a huge one. Version 3.0 delivers both UI and core changes, all to provide a better workflow for the prosumer and pro market.
Some of the highlights of this release:

- Far more customizable UIs with CSS-driven themes
- New culling view mode improves the workflow for choosing what picture to reject
- New timeline options brings darktable further into the photo management domain by allowing to browse
- Performance boost thanks to some operations now having implementations using SSE, SSE2, AVX, AVX2, and AVX512 instructions. The program will pick the right one on the fly depending on the CPU your computer has.
- New operations: rgb curves, rgb levels, filmic RGB, tone equalizer
- Redesigned History docker
- Second image preview window for dual display setups

For more info and screenshots, see the blog post.

Animation

Maurycy Liebner is now dealing with some extra attention to enve, his 2D animation application that I talked about earlier this week. Some of the changes coming in response to feedback from users are:

- Transform actions (rotation, scaling, moving) are now much more discoverable
- New disable/enable paint onion button
- New create node button
- New play from the beginning button
- Top toolbar buttons now context-aware
- Spacebar now toggles playback/pause (it’s the little things)

OpenToonz has some interesting things brewing for the next release. One major change is the coming support for stop-motion animation, for which the team decided to use the proprietary Canon SDK. The GPhoto option didn’t work for them, here is what Jeremy Bullock told me:

libgphoto2 is a pain to make work with Windows. Having to use winusb drivers can break other software; I’ve done it. I have looked extensively for a solution that uses open source solutions but nothing is as seamless as using the Canon SDK.

Some of the other recent changes are a vector guided drawing auto-inbetween option and refactoring of the Preferences dialog source code.

According to the roadmap, the team is targeting early January 2020 for a release candidate and late January 2020 for the final v1.4 release.
3D

Blender entered issue tracker curfew, which means the team now has stricter rules to define what constitutes a bug and how patches are dealt with. With the 2.80 release, the bug tracker exploded (similar thing casually happens to other big projects like GIMP and Krita when they hit a major milestone). Making things manageable is really, really important. It’s great that the Blender team came up with a reasonable plan.

Moreover, you don’t wanna miss the report by Clément Foucault that covers his recent work on adding a simple lighting system to Grease Pencil.

And then again, Pablo always has good stuff to share:

The new vertex colors in sculpt mode now have alpha and eraser tools. You can sculpt and paint on semitransparent objects, with rendering support in Workbench and EEVEE. #b3d pic.twitter.com/P5pHhSKGb4 - Pablo Dobarro (@pablodp606), December 29, 2019

So does Jeremy when it comes to Dust3D:

Remesh feature preview of #Dust3D, powered by #QuadriFlow. All the annoying trivial triangles turned to beautiful quads. pic.twitter.com/lvJDDhn8Pt - Jeremy HU (@jeremyhu2016), December 29, 2019

CAD

QCAD 3.24 is out featuring the use of multiple CPU cores for rendering, better support for non-uniform scaling, and, particularly of interest to Linux users, Wayland support. For more info, see the changelog.

Yorik van Havre posted an overview of things he worked on in FreeCAD the past few months. Most exciting changes are:

- First draft of a FreeCAD exporter for Blender
- Better default camera altitude (optional)
- Better IfcOpenShell detection
- BCF viewer now available to view and create annotations for issues in IFC files, this is intended for collaborative work on architecture projects.

There’s also a new video where Yorik demonstrates applications that are part of his BIM workflow: FreeCAD, Blender, QCAD, GIMP, Inkscape, IFC++, QGIS, SweetHome3D, LibreOffice.
One project I should have mentioned before is BlenderBIM that turns Blender into an architectural design application with BIM workflow support based on IFC4. The most recent update has some nice new features like the per-object color support, but most importantly, an IFC clash detection utility (BCF output planned).

Video

Janne Liljeblad released Flowblade 2.4. Highlights:

- Exporting to Ardour project files
- New Standard Auto Follow compositing mode
- Various compositor changes to improve image quality
- New transform filters, Position Scale (has a GUI editor) and Rotate and Shear
- Complete port to Python 3

For a more complete list of changes, please see release notes.

Kdenlive 19.12 is out. Most important changes:

- Newly added audio mixer with mute, solo and record functions
- Better performance
- Master effects for video and audio
- Numeric input now available in Lift/Gamma/Gain

See here for more details.

Denis Roio announced the first release of frei0r effects in the last 2 years. It’s coming with 3 new filters (normaliz0r, elastic_scale, premultiply) and major cleanup. Among new filters, elastic_scale is probably the most interesting one. It basically allows you to separate a frame into sections, and then scale sections in linear or non-linear fashion separately.

Music

After a warm and funny conversation on IRC about the humble beginnings of the Ardour project, Paul Davis ended up writing an entire post about the project’s history. Totally recommended!

Rui Nuno Capela made the Xmas releases of all the programs he maintains: QjackCtl, Qsynth, Qsampler, QXGEdit, QmidiNet, the Vee One Suite, and Qtractor. Some of the programs got better HiDPI support, others got a UI color scheme editor, and for Qtractor, basic key signature has been added to the tempo, time signature and location markers map.

If you aren’t a big fan of the original setBfree virtual organ 3D UI, you might like setbfree-controller.lv2 written by Valentin Valls for MOD units.
Art and showcases

Motion comics made with Grease Pencil in Blender, by Maisam Hosaini:

One of the latest works by Evelyne Schulz, made with GIMP:

It’s very nearly impossible to get tired of Blender renders by Mohamed Chahin:

Winter holidays

There are quite a few posts in the pipeline, including another one in the Introducing X application series (and it’s a big one). However, content-wise, my topmost priority right now is the annual 2019 report for GIMP/GEGL projects. LGW will return some time in January 2020. Happy holidays!
One of my common mistakes is sticking around a new interesting project for a long time without telling more people about it. So without further ado, I’m giving you enve, a free/libre 2D animation tool by Maurycy Liebner.

The essentials

Since people pay so much attention to project names these days, let’s get it out of the system: enve means Enve is Not a Video Editor. Yes, it’s a recursive acronym, deal with it :)

Here is a quick run-down of what enve does:

- Timeline-based animation, automatic tweening, all objects and filters properties are animatable
- Supported objects: Bezier curve, ellipse, rectangle, text, brush strokes
- Uses MyPaint’s brushlib as the painting engine, relies on Qt’s native graphic tablets support
- Ships with a basic selection of blending and compositing modes for objects (Porter-Duff, as well as Screen, Overlay, Color Dodge, Color Burn etc.)
- Supports multiple scenes per project
- Imports image sequences, video and audio files
- Outputs anything that FFmpeg supports
- Has separation into core and GUI and supports pluggable path and raster effects, including GLSL fragment shaders
- Has configurable preview resolution for better performance control, you can use presets or input anything between 0% and 999%
- Works on Linux, can be made to run on Windows and macOS (Qt)

From the UX perspective, enve is a bit of a cross between Inkscape and Blender, which has a lot to do with Maurycy being an avid user of both, professionally.
Just a few examples:

- You can use the path editing tool to edit rectangles and ellipses.
- When you edit a path, enve shows control points for two adjacent nodes so that you could easily tweak the shape.
- You can use G, S, and R shortcuts for moving, scaling, and rotating respectively, and for scaling, you can press X or Y to constrain the transformation to just one axis.
- The timeline design resembles that of Blender’s Dope Sheet, with the benefit of providing direct access to numeric values of various settings.
- Similarly to Blender, a panel can be duplicated vertically or horizontally, so that you could e.g. have access to different areas of the timeline or the canvas.

The design

I tend to avoid the topic of architectural design of applications, but in this case, it pretty much defines the extensibility of enve. Moreover, everything that goes below in this section is a verbatim copy of what Maurycy wrote to me. He explained it nicely, there was no point in retelling it in different words.

Enve is divided into two parts: the enve core library, and the app itself. The core library contains most of the backend code. The app, on the other hand, puts parts of the core library together and is responsible for the GUI.

The core library can be used by users to create their own effects in C++. This includes effects that influence pixels (raster effects) and effects that influence shapes (path effects). All they have to do is to subclass the relevant classes from the core library, compile their effect, copy the resulting *.so file into their enve directory, and they will see their effect in enve context menu, ready to use. The core library contains animatable properties (float, integer, color, path, etc.) you can assign to your effect with just a few lines of code. Enve handles everything from GUI to saving for all the properties and the effect itself.

There is also a special type of a raster effect called shader effect. Shader effects do not have to be compiled.
All (animatable) properties for a shader effect can be defined in an XML based file. The property values will be passed by enve to a user-defined GLSL-based fragment shader that will be used when applying the effect. From the point of view of the user, shader effects are indistinguishable from precompiled C++ raster effects, but they are more suitable for simple effects, and can only be processed on GPUs.

In theory, it shouldn’t be too difficult to expand plugin capabilities far beyond effects.

Currently, continuous builds do not ship with the plugin development kit. The code changes quite rapidly, meaning the user-created pre-compiled plugins would have to be recompiled and altered quite often. The ongoing effort is to do a major code cleanup, focusing on the core library, to make it better suited for long-lasting user-created plugins.

The good, the bad, and the ugly

You probably already guessed that I’m very excited about enve. I’ve tried every animation tool on Linux there is to try. There are some extremely powerful and production-ready applications around, like OpenToonz, and Blender just keeps getting better for 2D animation.

What can I say, enve instantly clicked with me. I have a pretty good idea what I need from an animation tool (granted, not much!), and I can do it without much stress with this program. Supporting GLSL fragment shaders for effects is just awesome. A lot of people have experience writing those, which means we can get from few effects to a ton of effects fast enough.

But I’m also well aware that the program is far from being perfect. It does have its quirks.

UX-wise, the mix of Inkscape and Blender is not a bad idea, but some things are just not obvious. I discovered the S and R shortcuts (scaling, rotating) purely by accident, and I was not clever enough to discover the G one (moving) despite the obvious Blender-like interaction.
Some buttons lack tooltips, and I just don't have enough experience to figure out what they do.

The program would also benefit from a context-sensitive toolbar for tool settings, which Maurycy is already moving towards. E.g. a recent commit adds toolbar buttons to control brush size; they are only visible when you select the brush tool.

Workflow-wise, some of the sorely missing features are easing functions (only Bezier-based interpolation is available) and working undo/redo support. Both are high on the priorities list though.

All in all, I'd say it's difficult and, frankly, unnecessary to bitch about a program that works well for its alpha development stage, has a nice design, and makes it comparatively easy for people to get started.

Getting the latest and greatest

In the past few weeks, the project has gone far from its initial v0.0.0 release in September 2019. Continuous builds as AppImage files are available on GitHub for every successful git commit.

Windows and macOS builds aren't even planned yet and will most likely demand an interested contributor who will stick around.

Further plans and how you can help

There is no public roadmap yet. Maurycy says:

I am yet to run out of things to do, so I do not spend much time planning.

However, if you are interested in helping out, some of the ideas are to create your own effects and/or make better examples and documentation for the effects dev kit. It would be a nice starting point for someone to get familiar with the enve code.

The origins

Since I'm clearly doing things backwards this time, let's finish this introduction with the origins of the project.

This is what Maurycy Liebner told me:

I work as a graphics designer (mostly 3D animations). Back in 2016, I was working on a short 2D vector animation. I tried creating it with Synfig, but for some reason, I found it unusable. I am not trying to discredit Synfig, I am sure there are people who use it successfully. 
It simply did not feel right for me at the time.

I ended up using After Effects, which was a lackluster experience. During that time, out of frustration, I opened the Qt Creator and started a new project called AniVect. Back then I did not have any plans for it, I just wanted to write some code. I had no prior experience with graphics related programming, so I had no idea what I was getting myself into.

Don't we all? :)
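
To make the shader-effect idea from the design section a bit more concrete, here is a minimal Python sketch of the general pattern: properties are described in an XML file, and each of them ends up as a uniform that a GLSL fragment shader can read. The XML element and attribute names, the type mapping, and the effect name are all invented for illustration; this is not enve's actual file format or loading code.

```python
import xml.etree.ElementTree as ET

# Hypothetical property description. The element and attribute names are
# assumptions made for this sketch, not enve's real XML schema.
EFFECT_XML = """
<ShaderEffect name="exampleTint">
    <Property name="amount" type="float" default="0.5"/>
    <Property name="steps" type="int" default="4"/>
    <Property name="tint" type="color" default="1.0 0.5 0.0 1.0"/>
</ShaderEffect>
"""

# Assumed mapping from the sketch's property types to GLSL uniform types.
GLSL_TYPES = {"float": "float", "int": "int", "color": "vec4"}


def parse_properties(xml_text):
    """Return a list of (name, type, default) tuples found in the XML text."""
    root = ET.fromstring(xml_text.strip())
    return [
        (node.get("name"), node.get("type"), node.get("default"))
        for node in root.findall("Property")
    ]


def uniform_declarations(props):
    """Build the uniform lines a fragment shader would declare for the properties."""
    return [f"uniform {GLSL_TYPES[ptype]} {name};" for name, ptype, _ in props]


props = parse_properties(EFFECT_XML)
declarations = uniform_declarations(props)
print("\n".join(declarations))
```

In a real host application, the current (possibly animated) value of each property would be uploaded to the matching uniform before the shader runs on a frame; the sketch stops at generating the declarations.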
Highlights: development updates from IfcOpenShell and Apertus projects, Mantaflow and USD support to undergo review for Blender 2.82 inclusion, Ardour is getting built-in MP3 support (for realzies!).

Graphics

It has been a more or less quiet week for both GIMP and Krita, although Agata Cacko was really busy hacking away on resources management in Krita's dedicated branch, fixing and improving the searching by tags etc.

Not much news regarding MyPaint, but if you were looking for a dedicated chatroom for the project, there is now an official Discord server.

3D

Blender 2.81a has been released with bugfixes.

The Grease Pencil refactor won't make it to version 2.82. You can check out the greasepencil-refactor branch in Git though. Same with LANPR and VR: they aren't likely to undergo the review in time for bcon2, and since those are big features, they will probably be skipped in 2.82. On the flip side, Sergey Sharybin will review the Mantaflow and Universal Scene Description patch/design this week. Speaking of which, Leif Edersen et al. published a blog post on Pixar's USD-based pipeline, both pre-production and production stages. No use of Blender, but an interesting insight into the inner workings at Pixar.

There's the usual sculpting awesomeness from Pablo Dobarro:

"Thanks to the performance improvement of the Sculpt Mode vertex colors new tools with real time preview can be supported, like this Color Gradient Tool #b3d pic.twitter.com/55Vb98BJtN" (Pablo Dobarro, @pablodp606, December 5, 2019)

In other news, Dalai Felinto moved to Amsterdam to work full-time on Blender.

GameDev

Rémi Verschelde says the Godot team is organizing a new GodotCon around FOSDEM in Brussels on 3 & 4 February 2020. They will also have a booth at FOSDEM and co-host the gamedev devroom on 1 February.

Oh, and if you've been wondering about new features in Godot 3.2, here is a video from GDquest explaining how to override the control over the game camera from the editor. 
Embark Studios, who are a gold sponsor of Blender, released their Embark add-on for Blender. The add-on includes tools for a standardized import/export gamedev workflow, 3D modeling, and new object types. Get it from GitHub.

CAD

Thomas Krijnen posted an update on IfcOpenShell 0.6 development. One of the biggest changes in the upcoming new version is support for multiple IFC schemas (IFC2X3, IFC4, IFC4X1, and IFC4X2) in the same executable and plug-in.

Video

The Kdenlive adapter by Vincent Pinon has been accepted to upstream OpenTimelineIO. This means interoperability between several NLEs, both free/libre and proprietary. The feature will be available in Kdenlive 20.04. Meanwhile, version 19.12 is about to be released.

The Apertus team, who work on the open AXIOM modular camera, posted a new progress report, covering the topics of AXIOM Remote and Google Summer of Code 2019 projects. There's more information in a recent blog post.

Music

A certainly interesting project I've only just run into is GridSound by Thomas Tortorini, Melanie Ducani et al. It's a work-in-progress DAW that works in your browser, it's free/libre (GNU Affero General Public License v3.0), and it has a working demo. Technically, it's an HTML5 app that relies on the Web Audio API. It comes with a timeline, a MIDI clip editor, audio playback (you can drag and drop pre-recorded audio files), a mixer, a built-in synth, and a simple audio filter. Fun fact: Brian Eno even tweeted about GridSound two years ago!

Robin Gareus recently updated Ardour's file format version to 6000 with this very commit message:

Ardour 6.0 Alpha - Enterprise Edition
Its 5 year mission
To explore strange new sounds
To seek out new bugs and new users
To boldly go where no Ardour session has gone before

Seems like the alpha release is getting closer indeed!

Among other changes, one thing I did not see coming at all is making MP2 and MP3 valid extensions. 
Which means these files can now be imported to sessions on Linux and Windows (it already worked on macOS because CoreAudio has native support for it). The implementation uses the minimp3 library by Lieff for decoding and seeking.

Developers still don't like the idea of MP3 sources in mixes but claim that samples in MP3 are a legitimate use case.

I haven't mentioned Qtractor for a while. Rui Nuno Capela recently added basic key signature support to the tempo, time signature, and location markers map.

The Open Music Kontrollers project has recently been busy working on the Mephisto LV2 plugin. If you are interested in developing your own effects and instruments, you are probably going to like this. Basically, it's FAUST running as an LV2 plug-in. You can edit the effect code in your text editor of choice, then save, and Mephisto will rebuild and reload it to apply the effect to your track or bus. This is kinda early work, there have been no releases yet, but if you are feeling adventurous, go grab it at GitHub.

There's more VCV Rack goodness coming:

"Coming in Rack v2: Three new modes for interacting with knobs using a mouse or touch screen: Linear (this is how knobs work in Rack v1), Scaled linear (move horizontally to adjust dragging speed), Absolute rotary, and Relative rotary. pic.twitter.com/EHbVO8Ks5E" (VCV, @vcvrack, December 6, 2019)

Tutorials

More baby Yoda craze, this time from SouthernShotty:

Louis du Mont created a Blender 2.81 tutorial on modeling a simple LCD face effect.

Joko Engineeringhelp published a short introduction to the TechDraw workbench in FreeCAD:

New Inkscape timelapse by grafikwork:

Art and showcases

Colin Levy completed SkyWatch, a sci-fi short he started working on back in 2012. All CG was done using Blender, with Andrew Price and Pawel Somogyi serving as 3D Leads. 
Ian Failes did an excellent behind-the-scenes interview with Colin at befores & afters. Well worth reading, if you are interested in how a project like this gets done!

Martin Trokenheim's PWN character doesn't mess around! (Painted with Krita)

The Great Wall by Philipp Urlich, made with Krita as usual:
Week highlights: lots of new releases all around, including Krita, AzPainter, Godot, LibrePCB, Audacity, and MuseScore, Canon CR3 support coming to RawTherapee, new ArmorPaint release shaping up nicely, and more.

Graphics

Øyvind Kolås fixed a bug in GEGL that used to cause GIMP crashes upon saving large files. It's entirely possible that a new Windows installer will be available soon without bumping GIMP's version.

Krita 4.2.8 was released with bug fixes. The team has been working on a resource management rewrite for many months now (Boudewijn started it), but this is the first time (that I know of) they mentioned it publicly, as this is where hours and hours of work have been going lately.

AzPainter 2.1.5 is out with small UI and workflow improvements (e.g. left double-click now selects all text) and a newly added Italian translation. You can get the tarball from OSDN.

Birdfont 3.32 was recently released with better tools for editing COLR paths (used in color fonts). A few bugfix updates have been available for this version as well.

Photography

The darktable team is entirely focused on preparing for the v3.0 release scheduled for the coming winter holidays.

Meanwhile, the RawTherapee team now has basic Canon CR3 support in a dedicated branch. Plus, support for more pixel shift files is coming, courtesy of Alberto Griggio and his friendly fork called ART.

3D

Frankly, it's impossible to avoid Pablo's improvements in Blender's sculpting toolbox. And why would one?

"Face Groups, the new visibility system for Sculpt Mode, is almost ready. Most of the patches to add Face Groups support to the current tools are finished and they will be submitted for review after committing the main patch. https://t.co/9veIJ4hS4b #b3d pic.twitter.com/uB6djngdAu" (Pablo Dobarro, @pablodp606, December 1, 2019)

By the way, if you are interested in what could make it to version 2.82, check out this email by Dalai Felinto. 
Exciting stuff: Sergey Sharybin appears to be assigned to work on VFX & Video features and is likely to finish VSE disk caching in time for 2.82 and will provide some clarity on the future of VSE.

Lubos Lenco made a pre-release of ArmorPaint 0.7. Highlights: custom export presets with channel swizzling, read layers and masks with material nodes, UDIM tiles export, built-in UV unwrap, improved DXR support.

GameDev

The Godot team released version 3.1.2: bug fixes, usability improvements, and documentation updates. If you are more interested in new stuff, you couldn't have possibly missed the first beta of version 3.2. But if you have, here are the release notes.

CAD

LibrePCB 0.1.2 is out featuring a design rule check for boards (missing connections, too small clearances etc.), BOM exporting to CSV, and printing to PDF. See the release notes for more details. By the way, if you are interested in some development background, there was a quick interview with Urban Bruhin regarding LibrePCB a month ago.

Koen Schmeets fixed a bunch of issues in SolveSpace on Catalina and started making personal macOS builds (signed and notarized) of what will become SolveSpace 3.0. The project appears to be still in the transition phase between former and new maintainers.

On the FreeCAD end, WandererFan has been applying a ton of fixes to the TechDraw workbench.

Video

Matt separated decoder retrieve functions for video and audio in Olive, as the parameters for the two are too different.

Dan Dennedy improved timeline editing in Shotcut: you can now move multiple timeline clips, do free-form movement for timeline clips, and a transition when dragging over a gap.

Ayush Mittal contributed a search bar to Pitivi for easier command search, similar to what you know from Blender, GIMP, and other applications. Looks like this has the potential to be merged soon.

The first release candidate for Kdenlive 19.12 is out, you can grab the AppImage here. The team is working on just the bug fixes now. 
Music

The MuseScore team released version 3.3.3 with bug fixes and usability improvements.

FluidSynth 2.1.0 is out with a new reverb engine and a new stereophonic chorus engine, support for the DLS sampler file format, and new audio drivers: Oboe and OpenSLES for Android, WaveOut for Windows, and SDL2. See here for details and downloads.

Audacity 2.3.3 is another recent release I should have mentioned in the previous recap. It's mostly code restructuring with some user-visible changes, such as: the Equalization effect was split into two effects, Filter Curve and Graphic EQ; leading silence is now preserved in exports; there is a newly added quality setting on AAC/M4A exports. Grab your download here.

If you are curious what's in the pipeline for VCV Rack 2.0, you can also follow the development blog. There's enough exciting stuff there to keep you interested.

Art and showcases

TacoBurger Truck by Ian Hubert, all work made with Blender. It's a shot from a film he's about to release soon.

People have been geeking out over baby Yoda in the Mandalorian series. Here's baby Jabba, by Juan Carlos Montes:

Firefox disconnected by Ferry, made with Inkscape:

An earlier speedpainting work by Sylvia Ritter, made with Krita:

For more Krita fun, check out this thread on KritaArtists started by Philipp Urlich. He posted a landscape painting for other artists to work from, and it paid off nicely :)
Things easily get messy when I skip whole weeks, so this is going to be one of the weird recaps where events are both recent and, ugh, not so much.

Highlights: new releases of GIMP, Cinelerra, and Blender, new tool in Krita, development status update from Ardour, new wavetable synth from the Helm developer, interesting work happening in the Flowblade project.

Graphics

GIMP 2.10.14 was recently released with a killer feature: basic viewing and editing of out-of-canvas pixels, added by Ell.

This is kind of a big deal. Previously, you wouldn't be able to see pixels outside the canvas if you rotated the image. Nor would you be able to transform pixels in that area, color-pick from it, or use those pixels as a source for cloning/healing. All of that is possible now. One of the missing bits is support in the selection tools.

Ell contributed another great new feature: the Normal Map filter. It doesn't have all the features of the old 3rd party plugin yet, but it's definitely on the radar.

Quite a bit of work was spent over the last several weeks on creating the new GimpProcedureConfig API and porting plug-ins to it. The point of the API is to allow preserving plugin settings across sessions (among other things). This is still kinda early work, as the general GUI for plug-ins (e.g. how to switch between presets and what defaults you reset to) will need revisiting.

The good news is that once a plugin is ported to the new API, further workflow and UI updates get picked up automatically, no need to touch the plugin code ever again. It's unlikely to be backported to the 2.10 series though.

The Krita team did a ton of work on fixing potential security issues found in the source code using Coverity, a static code analyzer. Another major change: Ivan Yossi added the much dreaded notarization support for macOS builds.

Even more interesting: Kuntal Majumder merged his Magnetic Lasso tool to the main development branch, so it will be available in Krita 4.3.0. 
I spent some time using it on photos and comparing it to the Intelligent Scissors tool in GIMP (same tool). In fact, Kuntal looked at both GIMP's version of the tool and at Photoshop's version. It looks like he tried to make the best of both worlds in the end :)

Here is the gist of what I discovered (Kuntal kindly corrected me on some of that):

Both tools create an editable sequence of selection anchors along a high-contrast edge.

The interaction is similar but not the same. GIMP's tool expects you to click roughly along the edge and then constructs a segment that tries to follow the edge between new and old anchors. Krita's tool does the same, but also allows you to click and drag along the edge. In that case, it will automatically create new anchors as you drag.

Krita's tool exposes four settings to control the algorithm, GIMP's tool doesn't expose any.

On the other hand, GIMP's tool has optional antialiasing and feathering of the selection. Krita's tool allows antialiasing in the Pixel Selection mode, and feathering is applied separately.

Both tools allow tweaking the position of selection anchors while you are still creating a selection (for GIMP, you need to enable the Interactive boundary option which is, for some reason, not used by default).

GIMP's tool allows inserting a new anchor between two existing ones at any time. With Krita's tool, you need to close the selection loop first and then double-click in the middle of two points to add a new one.

GIMP's Intelligent Scissors allows atomic undo: press Backspace to remove the latest added anchor (then press again for the previous one etc.). For Krita, press Shift+Z for the same (it won't undo repositioning of a node in the middle of creating a selection though).

Last but not least: with default settings, both tools make very similar selections and they both fail (which is kind of expected) when you create a new anchor too far from the previous one. 
In my experience, Krita's tool tends to place anchors more on the inner side of the object you are selecting, while GIMP's tool follows the edge more closely. I'd put it down to just the amount of work done so far. I'm sure Kuntal will apply tons of polish!

See his blog post for some more info.

A very cool patch being worked on by Shi Yan adds recording of a timelapse right from within Krita. If you ever were a Corel Painter user, you'll find it familiar.

Raghavendra Kamath launched a new website called Krita Artists, seemingly modeled after BlenderArtists. Very cool project!

Sergio Gonzalez released Milton 1.9.0, a resolution-independent painting application with infinite canvas. New features: smooth brushes, pressure-to-opacity, rotation (Alt to rotate), and canvas-relative brush sizes.

Sadly, Milton still doesn't build on Linux without some hackarounds. I have a binary build of the previous release from a contributor on GitHub, a patch was supposed to be submitted, and yet here we are. Hopefully, someone will push this right to the end.

Jesper Lloyd seems to have taken over maintenance of MyPaint and recently merged his patch that adds a user interface language preference. Brien Dieterle keeps making improvements to the bump mapping code in his private branch (you want to build the spectral_log branch).

Darby Johnston released several updates to DJV 2.0, featuring multi-threaded performance improvements and a new color management UI. The color management implementation uses OpenColorIO, and you can load your own OCIO configuration. Here is some F-Log footage loaded to DJV, with a transfer function added by Troy Sobotka (see the f-log branch in his filmic-blender GitHub repo):

There is no support for Looks yet though, which is a kind of transform meant to color-correct the output in a creative way (similar to what some people used to do with specifically crafted ICC profiles before). 
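
Going back to the Magnetic Lasso / Intelligent Scissors comparison in this section: the behaviour of building a segment that follows the edge between the previous anchor and the new one can be illustrated as a cheapest-path search over the image, where strong edges are cheap to walk along. The Python sketch below is a generic illustration under that assumption. It is not the code of either GIMP or Krita, and the cost function and the tiny hand-made gradient map are invented for the example.

```python
import heapq


def edge_cost(gradient):
    """Assumed cost: cheap on strong edges, expensive in flat areas."""
    return 1.0 / (1.0 + gradient)


def follow_edge(gradient_map, start, goal):
    """Dijkstra's shortest path between two anchors on a 2D gradient map.

    gradient_map is a list of rows of non-negative edge strengths;
    start and goal are (row, col) tuples. Returns the cells from start to goal.
    """
    rows, cols = len(gradient_map), len(gradient_map[0])
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # stale queue entry, a cheaper route was already found
        r, c = cell
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                new_cost = cost + edge_cost(gradient_map[nr][nc])
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    # Walk back from the goal to the start to recover the segment.
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1]


# A tiny map with a strong vertical edge in column 2.
gradient_map = [
    [0, 0, 9, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 0, 9, 0, 0],
]
path = follow_edge(gradient_map, (0, 2), (3, 2))
print(path)
```

A search like this also hints at why both tools struggle when a new anchor is placed far from the previous one: the longer the segment, the more chances there are for a cheaper route along some other edge.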
3D

The big news here is the release of Blender 2.81, featuring massive sculpting improvements, new remeshing options (OpenVDB Voxel and QuadriFlow), NVIDIA RTX support in Cycles with an OptiX backend contributed by NVIDIA, Intel Open Image Denoise support, far better soft shadows in EEVEE, a ton of new features in Grease Pencil, and the list goes on. Seriously, check out the release notes.

One thing I've been noticing lately is that people now use unstable builds even more often than before, now that Pablo Dobarro is rocking the sculpting toolbox. Judging by the recent weekly notes, 2.82 will probably feature Mantaflow and LANPR (line render engine, worked on during Google Summer of Code this year again).

There are new hires at Blender Institute. Julian Eisel is working on usability and VR/XR until spring 2020, with a prospect of further full-time employment. Sebastián Barschkis is working on fluids and generally improving Blender's physics simulations until autumn 2020.

Andreas Esau, known for the Asset Sketcher and BPainter add-ons, has been working on editable measures for precision modeling lately.

"More work on the measure #ExactoTools Adding active selection measure. Easily make it to a permanent selection. Measure from cursor. UI is going to change though. Planning to make it more easy then now. #b3d pic.twitter.com/W2ISkuL7gl" (Andreas Esau, @ndee85, October 10, 2019)

CAD

Florian Roméo spent a lot of time in September cleaning up the opengl branch of what will eventually become LibreCAD v3, and in October, he merged it into the master branch.

I haven't mentioned LibreDWG for a while. The project is doing pretty well, Reini Urban is extremely active. In October, he released version 0.9 featuring initial DXF importing (r2000 for now) using the new dynamic API he added a version back, a 3DSOLID encoder, various new API methods, and a bunch of bug fixes. 
See here for more info.

Given the amount of changes that break the API, it's probably safe to assume that there will be quite a few more releases of LibreDWG before Reini calls it stable (he actually called v0.8 from June 2019 an alpha release).

Video

Matt re-implemented a bunch of tools and functions in Olive, revisited the Preferences dialog, and worked on internal color management. There's also a new properties dialog for footage, where you can toggle alpha association (the checkbox currently says "Alpha is premultiplied", which might anger a few people out there), set interlacing etc.

Jean-Baptiste Mardelle created a built-in audio mixer in Kdenlive, the code is now in the master branch and will be part of the December release. More is coming to Kdenlive next: Vincent Pinon created an OpenTimelineIO adapter for Kdenlive files.

This is what he said about a month ago on Telegram:

It successfully imports a Kdenlive timeline, displays it in OTIO format, then re-exports to Kdenlive that Kdenlive can really open (missing all the unsupported stuff).

So two things need to happen next: Vincent's code needs to be merged to upstream OTIO, then Kdenlive needs to be patched to call the otioconvert tool.

That makes two free/libre NLEs diving into OTIO support now, the first one being Pitivi. All extremely exciting!

Adam Williams released Cinelerra 7.2. This time, he focused his attention on audio related features. Here are some of them: new Flanger, Chorus, Tremolo, and Multiband Compressor plugins; VU meters and band snapping in the Compressor; PulseAudio support; sample-accurate keyframes for audio plugins.

This is actually the second Cinelerra release this year, which is kind of unexpected. Usually, Adam makes one a year.

Three of the four new audio plugins already made it to Cinelerra GG, but the November release isn't out just yet. 
The October release of the program featured better HiDPI support (use Scaling Layout in Settings > Preferences > Appearance to configure), 25 new shape wipe transitions, faster AV1 decoding etc.

The Flowblade project got two new contributors. If you've been around long enough, you probably already know Pascal de Bruijn, who wrote a bunch of articles on color management in Linux. Pascal tweaked render encoding options and bits of the UI, and added frei0r's defish0r effect for fixing lens distortion.

Nathan Rosenquist added Apple ProRes 422 as a proxy clips option and did some work on Ardour exporting. This got the Ardour team to make an interesting announcement: Robin Gareus pointed out that Ardour's mid-/long-term plan is to support OpenTimelineIO.

This is interesting, because up till now, sharing your Ardour project with someone using other programs was mostly done via stem exporting, as Paul has been quite vocal in the past about his dissatisfaction with other options such as AAF and OMF.

Audio and music

One of the most exciting pieces of news in this department is the first video demonstration of Vital, a new wavetable soft synth by Matt Tytel. Matt, who is known as the creator of Helm, is currently running a closed beta testing program (which kinda makes sense, since otherwise he'd be overwhelmed with feedback).

Vital is really interesting, not least because of MPE support. So if you happen to own ROLI's Seaboard or Haken Audio's Continuum, or, who knows!, an Eigenharp, this is right up your alley.

Here is a rather long video of a talk by unfa, well worth watching.

It's undecided yet what funding model Matt will pick for the project. It might very well be the case that I'm advertising a project that will turn out to be proprietary (gasp), we just don't know. Matt will make an announcement when it's the right time to do so. For development news, you can follow him on Twitter. And there's an official Discord channel too.

Chris Cannam did a series of releases in the Sonic Visualiser family. 
As usual, all great things for anyone involved with academic music research.

Ardour now has an Autostart option in the Engine dialog at startup time. By enabling it, you tell Ardour to automatically launch the selected engine with the selected settings when the currently chosen device is available. So, one boring thing less to do when all you want is to make some music. Moreover, the startup code has been revamped to use a finite state machine.

Paul Davis started redesigning the declicking and fades around loop boundaries to provide smooth looping.

Ben Loftis of Mixbus added support for two new control surfaces, Behringer X-Touch One and RuCo, the latter being a DIY controller by Russell Cottier made specifically for Mixbus users.

Robin Gareus started working on a virtual keyboard right inside Ardour, so now you can use your regular computer keyboard as a MIDI input device routed to any MIDI track. Robin also started working towards VST3 support in Ardour. This is early work in a dedicated branch. It's kind of exciting but not necessarily going to be available in 6.0. We'll just have to wait and see :)

You can read about most of that in slightly more detail in a recent development update from Paul Davis. TL;DR: 6.0alpha is going to happen soon.

Another thing worth mentioning here is an op-ed by Artemiy Pavlov on Korg's option to load custom plug-ins into Minilogue XD, Prologue, and Nu:Tekt NTS-1. The article was greeted by some interesting constructive criticism from Paul Davis in the comments section.

Apart from Paul being an Ardour developer, there are more reasons to mention this. First off, Korg's SDK is open source software (BSD 3-Clause). Secondly, before founding Sinevibes, Artemiy used to contribute to free software. For instance, he wrote a few LADSPA effects used in the internal signal chain in Hydrogen, a free/libre drum sequencer. So he's not just some random guy writing proprietary code.

Finally, Korg is really not the first to create an open platform. 
Among recent projects, MOD's units are built around the LV2 API that is far more capable, and you can create your own plugins using e.g. Faust, which is a far more versatile high-level programming language than what Korg provides. That said, deploying a newly developed plug-in is where Korg clearly wins (if one was to compare a synth to an FX box with MIDI/CV features).

Tutorials

I kinda hesitate to share this, because not everyone has the time and patience to watch a 1h43m-long video on just two darktable modules. I'm doing it though, mostly because I think that, together, filmic and tone equalizer are going to make a huge difference for darktable v3 users.

You will also definitely want to check out a new series of videos by Pierre on changes in the upcoming darktable 3.0. The first one covers changes in the pixel pipeline:

Blendernation published a great behind-the-scenes post on the recent Ice Cream Shop artwork by Felipe Del Rio:

Golden Ribbon made a tutorial on drawing an iOS-styled construction kit icon with Inkscape:

Artworks

Fictional Russian spacecraft Уран-1 (Uranus-1), an asteroid destroyer. Blender/Photoshop. Modeled with Hard Ops and Boxcutter add-ons.

Fletch Graham is doing some great Nodevember work with Blender. Here is just a procedural height map and micro displacement, with source files available to study:

Anaud Imobersteg posted his new work called Roots to BlenderArtists. The scene was built in Blender 2.8, using Megascans and Mixer, Graswald, Poliigon. Rendered in Cycles and post-produced in Photoshop:

Some chibi characters from Yudha Agung Gumelar, made with Inkscape:

Valeria Vasneva published a new autumn moody painting with a bit of a Halloween touch, made with Krita:

Speaking of autumn, here's a new speedpainting from Philipp Urlich (Krita, as usual):

And here is an interesting work by Ilya Bar, made with Krita, in fact!

I am drawing mostly with polygon tool (not vectors). Each polygon on separate “Paint” layer. Here about 30 layers. 
For example, the fox has a main body (red) layer, a white fur layer, a dark fur layer, a highlight layer, and 2 shadow layers.
Philipp Urlich made a bit of a splash recently in the Krita community with immersive landscapes and experimental use of NVIDIA's GauGAN-generated images as the basis for new artworks. He was kind enough to agree to have his works featured on LGW and answer a few questions.

Hello, Philipp! Could you please tell me a little about yourself and your artistic background?

Philipp Urlich: I'm 44 years old and live in Switzerland with my beautiful wife and two daughters.

I think my mom was the trigger to get it all started. She was painting traditional deco art, painting old wooden wardrobes and cabinets when I was like 4-5 years old. Some pencil sketches of flowers and portraits. There was this interesting series of books on how to draw that she had from a remote art school in England, and I was utterly impressed by some of the sketches in there.

<img src="https://librearts.org/2019/09/the-art-of-philipp-urlich/philipp-urlich-settlers.webp" loading="lazy" alt="Settlers" width="100%"/> Settlers

I have a graphic design background and had traditional training in many different disciplines, such as drawing and fine art, photography, sculpting, even letterpress and typography. It was the time when computers and the first Photoshop started playing a role. But there was no internet yet.

The one-year pre-course at art school was maybe the best time I ever had. That's where I learned about art history and all the numerous old masters, you name it. I learned about composition, colors, and shapes.

As life goes, I wasn't really seriously drawing or painting anymore for almost 20 years and just started again a couple of months ago (March 2019) with my new tablet.

My impression from looking at your ArtStation profile is that you paint environments a lot more than anything else. What's the most fascinating part about them for you?

Philipp Urlich: Because landscapes or environments are very immersive. 
It's a window to another world.

<img src="https://librearts.org/2019/09/the-art-of-philipp-urlich/philipp-urlich-cataclysm.webp" loading="lazy" alt="The Cataclysm" width="100%"/> The Cataclysm

I'm still in a phase of finding my workflow. I try out different styles and techniques and try to experiment. Landscapes are the easiest way for me to do it, and I just love landscapes.

I'm still trying out things while getting into it again. I feel landscapes give me a lot of freedom to mess around with bold shapes and colors without losing myself in too much detail. I just love nature.

What's your usual trigger for new ideas?

Philipp Urlich: Sometimes I have this idea and I try to visualize it. Sometimes I use inspiration I got from seeing something in nature or in an image, or from reading or hearing something. Often I just start chaos-modeling shapes and let it do things for me. I currently never use references.

<img src="https://librearts.org/2019/09/the-art-of-philipp-urlich/philipp-urlich-the-great-wave.webp" loading="lazy" alt="The Great Wave" width="100%"/> The Great Wave

Just a few days ago, I read a great interview with Robert Del Naja of Massive Attack that covers quite a few topics regarding the use of technology such as neural networks by the band.

Part of his reply on AI boils down to this: an algorithm can generate “surprises” that contribute to the final record in the way that bloopers can affect the songwriting process and get musicians to do something entirely different. And, specifically, he says this: “Great artists can now steal with algorithms”.

Some of your recent artworks use NVIDIA GauGAN-generated environments as the first step. Does Robert's point of view apply to digital painting in any way, in your opinion? What's your personal experience with GauGAN?

Philipp Urlich: Very interesting question indeed. “Good artists imitate, great artists steal” (Picasso). I think it applies to all art forms. To get out of the usual habits and create something new. 
There’s currently also this interesting discussion going on about whether generative computer art is real personal art and can be sold as such. What is it? <img src="https://librearts.org/2019/09/the-art-of-philipp-urlich/philipp-urlich-galioth.webp" loading="lazy" alt="Galioth" width="100%"/> Galioth What if you never knew there was a machine behind it? I think what Naja also meant was that great artists can now steal from a machine so nobody can say “hey, you copied Artist X!”. Or it’s just another tool to break out of things. Ever played with dice to create something random, or with moirés or photographs blended together? How are sound effect engineers doing their sound effects? It’s true and has always been like this. Artists look for new ways to get out of their usual loop and this new area of AI is just one tool out of many. As I mentioned in the previous answer, it’s a common technique to draw something random and chaotic to get new inspiration for shapes and whatnot. Our brain works for that just as well. Happy accidents. Sometimes just a new special brush can ignite something. <img src="https://librearts.org/2019/09/the-art-of-philipp-urlich/philipp-urlich-visitors.webp" loading="lazy" alt="Visitors" width="100%"/> Visitors I think for quickly producing material for ideas, prototyping and inspiration, it can be a great helper. GAN is still in its infancy and maybe we will see a lot of improvements in what can be done. GauGAN is a nice tool but it has its limits, and the artist is the initiator that has to provide the input. What if there’s also AI that creates segmentation maps for input? <img src="https://librearts.org/2019/09/the-art-of-philipp-urlich/philipp-urlich-on-a-sunny-day.webp" loading="lazy" alt="On a sunny day" width="100%"/> On a sunny day Wouldn’t it require someone to at least generate some requirements for AI to start with? So we would still have a human behind artworks? Philipp Urlich: Yeah, I think of it as a random generator of shapes for landscapes with a trained AI. 
It could also be a mix where you define where a key element like a river or mountain would be and the AI completes it. <img src="https://librearts.org/2019/09/the-art-of-philipp-urlich/philipp-urlich-tree-brothers.webp" loading="lazy" alt="Tree brothers" width="100%"/> Tree brothers How did you discover Krita and what were the features that got you to stick with the software? Philipp Urlich: I think it was around 2017. I was just looking once again for alternatives to Photoshop and the other painting tools and was surprised by how many features it has, like animation, layers, and brush engines. It’s all there and the interface is simple and intuitive. I fell in love on the first try, so to say. <img src="https://librearts.org/2019/09/the-art-of-philipp-urlich/philipp-urlich-hunter.webp" loading="lazy" alt="Hunter" width="100%"/> Hunter My most used features are color pickers and E (eraser). Most importantly, when switching to Eraser, the selected brush has to stay the same (hello Ps). <img src="https://librearts.org/2019/09/the-art-of-philipp-urlich/philipp-urlich-coming-home.webp" loading="lazy" alt="Coming home" width="100%"/> Coming home What are the things that you’d love Krita developers to improve/add? Philipp Urlich: Performance and stability are always something that can be improved, and they did a lot recently. There are workarounds for many things if you work on large image sizes. Other than that, I can’t think of any currently. <img src="https://librearts.org/2019/09/the-art-of-philipp-urlich/philipp-urlich-trance.webp" loading="lazy" alt="Trance" width="100%"/> Trance Who are some of the artists you look up to? Philipp Urlich: There are many, just a few: Frazetta, Moebius, Degas, Sargent, R. Schmid, A. Bierstadt, Schischkin, Waterhouse, Jean L. Gerome.
Week highlights: Inkscape 1.0 beta released, GIMP is getting built-in Normal Map filter, Krita team brings more improvements and bugfixes, darktable team is wrapping up v3.0 development, new versions of OBS Studio and Shotcut. Graphics It’s been a few interesting weeks over at GIMP and GEGL. First off, the master branch of GIMP can now optionally be built with Meson thanks to Félix Piédallu. There are more bugs to iron out, but it basically works. For developers, this decreases local build times. Most users will probably be unaffected. Ell continued his work on the out-of-canvas feature set, adding a user-requested option that makes it possible to preserve the canvas padding color instead of using the checkerboard when the Show All option is on. You can either set it on a per-image basis or make it the default. Michael Natterer and Jehan continued working on the plug-in API. In particular, Michael started addressing a few 17-year-old feature requests asking for a way to make plug-in settings persistent across sessions and for a Reset button. An existing patch for the former is yet to be pushed to the main development branch; the latter already works in the old Despeckle filter used as a testbed. Another new feature added by Ell is changing the compression type for tile swap. It looks puzzling and overly technical until you know what his intention is: “I have a long standing plan of automatically ramping up the compression when you’re running out of swap space, to buy you more time to save everything and regroup :) Right now, this option is there mostly for experimenting/extra control.” Ell also contributed a new GEGL operation, Normal Map, currently sitting in the workshop, which means it’s not yet built by default. Basically it’s because he’s not done with it yet. Some features like filter type choice are still missing. It’s hard to say if this will make it to GIMP 2.10.4 (if, like me, you build GIMP with workshop enabled, you don’t need to worry). 
We’ll see. The Krita team recently released version 4.2.6, mostly with bugfixes (over 120 people participated in beta-testing). Two new features are: the New layer from visible command is now available in the layer’s right-click menu, and ANGLE is now used as the default renderer on Windows. The master branch is seeing some good action too. Agata Cacko added a simple progress bar for saving KRA files to improve visual feedback. Thanks to Lynx3d, the screen color picker can now pick from reference images too. Oh, and Boudewijn Rempt fixed a crapton of resource and memory leaks. Wolthera continues hacking on SAI files support in a dedicated branch of Krita. Recently, she added some tests to validate correctness of loading the data, then added basic layer style support, basic masks support (more fixes to follow), implemented the Binary blending mode, fixed clipping groups to load correctly, and added support for reading/applying the DPI value. Inkscape 1.0 beta is finally out! This has been years and years in the making, and it will hopefully soon be completed. Some of the highlights of the upcoming release are: optional coordinates origin in the top left corner; canvas rotation and mirroring; better HiDPI display support; centerline tracing; tons of live path effects improvements; variable fonts support. macOS users will also love the native UI and signed/notarized .dmg files. One of the interesting aspects of the beta release is the new multicolor icon theme and advanced theming options. Basically, the theme is designed around several key colors that can be changed in the Preferences dialog (the red, the green, and the sky blue colors on the screenshot above). Downloads are up at https://inkscape.org/release/1.0beta1/platforms/. Photography The darktable team seems to have started wrapping up writing new code. The next release, v3.0, is likely to be done around winter holidays time. Aurélien Pierre finally merged tone equalizer, a darktable module he’s been working on for a good part of the year. 
The module is essentially another take on separating lightness into zones (blacks, shadows, midtones etc.) and adjusting them selectively. The module has some on-canvas interaction seen on both screenshots: hover over a region that belongs to a zone, then scroll the mouse wheel up or down to adjust EV. Adjacent zones will be compensated for, and unrelated zones won’t be affected at all. There’s a slightly more advanced UI that displays zones and the histogram and allows painting right over the EQ curve to tweak it. For background information on this feature, there is probably no better source than a dedicated thread over at Pixls.us. (One more important change, filmic v3, is better left for the next weekly report.) Speaking of which, another new fun project is ART, or Another RawTherapee. It’s a friendly fork of the well-known photography application, also announced at Pixls.us. Alberto Griggio started it to flesh out some ideas for the original project. He ended up sticking to his fork because of how far the changes went. So far, Alberto seems more inclined to focus on local editing tools, in particular advanced masking tools, reusing darktable code/ideas where applicable (his tone equalizer is based on an earlier version of the darktable module), and streamlining the pipeline to his liking. Source code is over at BitBucket. Franco Comida merged the librtprocess integration code into the main development branch of Luminance HDR. If you want to know how this affects tone mapping in terms of rendering quality, see his old/new previews from a thread on GitHub. The improvement is quite spectacular. 3D and VFX Blender news is nicely packed in another Blender Live session by Pablo Vazquez: As usual, more stuff from Pablo Dobarro: Voxel Remesh update: - Better topology generation. It produces the same level of detail using fewer vertices. - Volume and detail preservation. It only updates the areas of the mesh that changed. The mesh smoothing problem is now fixed. 
#b3d pic.twitter.com/RghgBPfWyC (Pablo Dobarro, @pablodp606, September 20, 2019) And more: Quadriflow now supports mesh symmetry. This drastically improves the performance and the quality of the results #b3d pic.twitter.com/05d93QUOBq (Pablo Dobarro, @pablodp606, September 18, 2019) The first appleseed 2.1.0 beta is released, featuring things like on-the-fly OSL shader compilation, full support for Cryptomatte, and render checkpointing, i.e. resuming multi-pass renders after they were interrupted. Video Hugh Bailey et al. finally made a much-anticipated new release of OBS Studio, the free/libre video broadcasting and screencasting software. Some of the highlights: ability to pause recording; new option to automatically adjust bitrate instead of dropping frames; ability to select multiple sources on the preview; browser sources can now have their volume adjusted via the audio mixer; fixed hardware acceleration support for decoding media files. Get it from the project’s website. More than that, Twitch joined NVIDIA and Logitech in sponsoring Hugh’s work on the project and committed to an undisclosed annual donation that (it is safe to assume) surpasses 50 grand. The team will also have a booth (first time ever) at TwitchCon 2019 in San Diego. Dan Dennedy released Shotcut v19.09.14 featuring multi-select for playlist and timeline, new default shortcuts, six new video filters, some other improvements, and a bunch of bugfixes. See the news post for more details. Matt completed the transform effect in Olive, then went on to set up continuous integration including automatic builds for Windows, macOS, and Linux (AppImage). Hint: you can now grab Windows builds in the Artifacts section at Appveyor, but be warned that it’s alpha-quality code. Great new features. But not ready for production yet. But so tempting... And yet... Oh well. Alexandru Băluț created a merge request for Pitivi that adds editing of nested timelines, a new feature developed by the project’s GSoC student Swayamjeet Swain over the summer. 
It looks like there’s some cleaning up to do before this can be merged. Not a ton of things going on over at Cinelerra GG, but they recently removed a timebomb placed by Adam Williams over 10 years ago. It made Cinelerra unusable if the currently installed copy was too old (and thus there was a chance that any reportable bugs were already fixed in newer releases). Meanwhile, Einar Rünkaru is single-handedly working on Cinelerra CVE. Most of the work is low-level under-the-hood stuff, although the keyframable Crop effect sounds end-userish enough to me. Audio and music Andrew Belt keeps expanding the ecosystem of VCV Rack. The modular synth now has a separate Chords module, which is a quad-note chord sequencer, and you can now write new modules in JavaScript (support for more languages is coming). Nils Hilbricht et al. announced the initial schedule for this year’s Sonoj convention that is taking place on October 26-27 in Cologne, Germany. Some of the topics are JACK, Qtractor, Vital (a new synth), recording sample libraries from acoustic instruments etc. Tutorials New Krita timelapse from grafikwork: New Blender tutorial by Nita Ravalj covers the topic of modeling fur: Rositsa Zaharieva posted a new the-making-of timelapse for a painting she recently did with Krita. Inkscape basics with Nick Saporito: Artworks and showcases The most impressive, hilarious and god knows what else showcase this past week was an attempt by Grant Wilk to create a microprocessor with Blender. Not model one, actually create one. Here's a peek at the memory system for my Blender microprocessor. Each register dynamically maps a block onto a plane depending on the address and width. 
The image will then be rendered out on a frame update (clock tick), and read back in on the other end. #b3d #blender3d pic.twitter.com/5ziDGOFFS7 (Grant Wilk, @remi_creative, September 22, 2019) Some more information: Ladies and gentlemen ... Tonight I cracked the code to storing memory using Blender's node editor. This means that I can continue engineering my node editor microprocessor that will eventually become a computer. Stay tuned. Stay creative. #b3d #blender3d pic.twitter.com/uQPel1QHJo (Grant Wilk, @remi_creative, September 21, 2019) Or just follow Grant on Twitter, this is fun! Atheris hispida might indeed be one of the inspirations for dragons as creatures, as suggested in a BlenderArtists thread for this Cycles render of one. Lucas Falcao posted a few close-ups from his recent personal render that looks like it was taken out of a very cool animated movie. All done with Blender/Cycles. Here some close up renders from the cat scene. #b3d #cycles #blackcat pic.twitter.com/VZwA9Stpkx (Lucas Falcao, @lucasfalcao3d, September 18, 2019) Raghavendra Kamath posted another artwork made with Krita: Marcelo Queiroz has been posting his renditions of DC superheroes on Inkscape’s Facebook group for a while now, all work done with Inkscape and GIMP:
Week highlights: out-of-canvas pixels now possible in GIMP, Krita team goes on a bugfixing spree, lots of changes in upcoming Blender 2.81 and FreeCAD 0.19, GSoC project for OpenGL rendering in LibreCAD v3 now completed, more work-in-progress goodness in Olive. Graphics There’s more under-the-hood work on GIMP’s plug-in system, but there have been user-visible changes too, and you are likely to love those. Ell contributed a new feature: showing all pixel data outside the canvas boundary. It comes with an optional canvas boundary display (red dotted line). When enabled, the padding around the canvas gets replaced with the common alpha checkerboard, and all content outside the canvas is revealed. Ell also made it possible to use several tools outside the canvas. For now, it’s painting and cloning tools. Adding support for selection tools will probably bring GIMP considerably closer to a full-blown implementation of unbounded layers. Alessandro Francesconi released another new version of BIMP, a visual batch processing plug-in for GIMP. The two latest updates feature improved GUI flexibility and HiDPI support (contributed by one of the Pencil2D developers), as well as support for WebP and HEIF. Before HiDPI fix: After HiDPI fix: Dmitry Kazakov, Boudewijn Rempt, Wolthera, and Lynx3d fixed probably a dozen memory leaks in Krita and twice as many general bugs here and there. There’s more progress by Wolthera in adding support for SAI files: fixes for layer blending modes, masks support. Meanwhile, Kuntal Majumder continues working on his GSoC project, the Magnetic Lasso tool. Recently, he added the ability to cancel a selection and start anew, as well as to edit checkpoints. 
He then implemented the basics of lazy filtering. Some members of the GNOME team are now promoting Obfuscate, a new program created by Bilal Elmoussaoui, specifically designed to blur and redact sensitive information on screenshots and images, like in this case part of tabs in Chrome: 3D and VFX Pablo delivered another great recap of recent changes in Blender, there’s not much need to repeat that, just watch the video :) You are going to love this extra though: I added a new set of properties to the paint brush to match most digital painting applications. Brushes now have opacity, flow, density, hardness, wet paint, tip shape controls and tip rotation. #b3d pic.twitter.com/NfPehgkso8 (Pablo Dobarro, @pablodp606, September 4, 2019) Or maybe this update on the eyedropper tool in Grease Pencil? Check out a new photogrammetry add-on for Blender. There’s nice coverage over at Blendernation. One more important topic here is free distribution of paid add-ons for Blender. This was an interesting conversation to watch: The WordPress ecosystem had this controversy ages ago, Blender is comparatively late to the party, and yet it’s a conversation we needed to have. CAD LibreCAD’s only GSoC project, OpenGL rendering, is now complete. Kartik Kumar continues working with Florian Roméo and Armin Stebich on cleaning up the code. For details, see his final report. Interesting things are going on with FreeCAD. There’s a newly introduced Points workbench by Jean-Marie Verdun that provides tools for working with point clouds. The Check Geometry tool (verifies if you have a valid solid) got more settings. More interesting things, courtesy of Victor Titiov, are going on with the Show module. Apparently, you can now have multiple temporary visualizations in arbitrary order and a plugin system. 
As the first tangible outcome, this helps allow another workbench to do sketch editing. The OpenSCAD workbench now supports extrusion with an angle, and the DraftFillet tool got a new option to change a fillet to a chamfer, courtesy of vocx-fc. Finally, there’s a ton of updates in the FEM workbench by Bernd Hahnebach. Video I find it hard right now to report on Matt’s progress with Olive, because the master branch is pretty much unusable. So here is an unordered list of recent changes: OpenImageIO support (cheap access to OpenEXR, DPX, Cineon etc.); lots of color management work done; functions for alpha dis/re/association; Alpha over and opacity nodes are functional, transform (2D, 3D, and 4D) node is partially functional; node caching and render caching improvements; new slider widget that can handle both integers and floats; new playback controls. Really, I can’t wait to see all this in a usable state. Jonathan Thomas continues doing good work with OpenShot. There hasn’t been a release since March. But most recently, the program got support for Blender 2.80. This is where I just have to quote Jonathan: “On a side note, I really love the new version of Blender. It is very inspiring, the entire Blender story leading up to this release. It will continue to be an inspiration for OpenShot and myself. Good job Blender devs!!!!!” Cinelerra-GG now uses libdav1d instead of libAOM for AV1 support by default; this change is part of the most recent release. It also got a new crop plugin and timeline bars, which aren’t in any release yet. GNOME Subtitles is seeing more activity lately again; both new releases, v1.5 and v1.6, have bugfixes and small enhancements rather than new features. 
Tutorials A good introduction to darktable, made for PetaPixel readers: New photography postprocessing tutorial for GIMP users by Davies Media Design: GDquest explains using file layers in Krita to make game art mockups: And here is a timelapse showing how to make a low-poly eagle logo with Inkscape: Artworks and showcases I’ll never get tired of posting new artworks by Philipp Urlich, made with Krita. This one is based on a GauGAN render that he did. More speedpainting with Krita from Sylvia Ritter: Urban environments by James O’Brien, made with Blender, are always good solid stuff. Puffin spotting on Cannon Beach is a short animated film by Zale, made with Blender and based on an illustration by Zoe Persico. The new FreeCAD showcase is a gas turbine for a radio-controlled model aircraft, based on a 1992 design by Kurt Schreckling. It was designed with the upcoming FreeCAD 0.19, although no assembly workbench was used as there was no need for it, or so the author claims. Random things The funniest thing made with Blender I’ve seen in a while: Procedural Burger System #WIP Blender 2.80 Eevee #proceduralmodeling #proceduraltexture #AnimationNodes #b3d #blender3d #blender pic.twitter.com/tWD6FNbWhO (sakura, @sakura_rtd, September 8, 2019) But procedural designs go even further: I made a procedural solar panel texture. not perfect yet, but I'm getting there. The bevel at the edges however, is not procedural, this is just geometry. Whatcha think? #blender #blender3d #b3d #blendercycles #blendereevee #eevee #solarpowerrrrrrrrrftw pic.twitter.com/H9nZP4zOJg (BluePixelAnimations, @BluePixel2017, September 9, 2019)
Week highlights: Krita team begins working on SAI files support, new releases of StereoPhotoView, Blender Power Sequencer, and Flowblade, even more sculpting awesomeness in Blender. Graphics For GIMP, Michael Natterer and Jehan Pagès spent the entire week porting plug-ins to new APIs, so not much fancy stuff going on. One of the most interesting things going on with Krita right now is 3rd party funding to support SAI files via a newly developed library called libsai (apparently, made by someone from Epic Games). Only loading is likely to happen because of this: “Writing probably isn’t going to be possible… The file format is completely crazy, a kind of virtual file system with encryption keys. The work done at https://gitlab.com/Wunkolo/libsai is amazing.” It looks like this is FreeHand all over again. Wolthera is currently adding decryption support to make it possible to load actual bitmap data, not just the layer tree. Other than that, the team is mostly fixing bugs and preparing for the release of Krita 4.2.6 (a beta is available, and the team needs your feedback). Photography Alexander Mamzikov released a new version of StereoPhotoView, an application that allows viewing and basic editing of stereoscopic images and videos. New features: variable alignment (scene depth) in stereoscopic video content, gallery display and navigation when opening a photo from separate sources, automatic rotation when loading JPEG as per the Exif orientation tag. 3D and VFX Pablo Dobarro does a lot with sculpting tools in Blender, and this is one of the most amazing recent changes: Regularized Kelvinlets brushes. A new set of brushes based on physically correct elasticity #b3d https://t.co/gGQ1NDF426 Original paper: https://t.co/9NblishaXp pic.twitter.com/nQQi40hn8k (Pablo Dobarro, @pablodp606, August 30, 2019) For even more Blender changes, here’s a new weekly video review: The Blender team is also looking for a UX and web designer. 
See if you are interested and fit the criteria. Nathan Letwory released a new version of his .3dm (Rhino) importer for Blender, with preliminary curves support. Stephen Agyemang, appleseed’s second GSoC student, posted his report on the implementation of practical path guiding in the renderer, which he worked on over the summer. This technique improves renders where indirect lighting is involved. See here for more details. Frédéric Devernay cut another release candidate for Natron 2.3.15. CAD Yorik van Havre posted his monthly update about the work he and fellow team members did on FreeCAD, mostly in the BIM and Arch department. Some of the highlights: BuildingParts now have a built-in, implicit section plane. Various TechDraw ArchView and DraftView improvements. DXF importing/exporting is now done with correct line color and style. Even more importantly, the Link branch has been merged to the main development branch and allows FreeCAD to share object data (e.g. geometry) with other objects, inside or outside the file. This is pretty much a prerequisite for assemblies. See the full report for more info. Video GDquest released Blender Power Sequencer 1.3: Version 1.4 is already in the works, in case you missed some of the new feature announcements: Coming in Power Sequencer 1.4: a much nicer trim tool. Inspired by @OliveTeam's edit tool. Supports snapping, and trimming all channels at once. #b3d pic.twitter.com/Dq4flq4sME (GDQuest, @NathanGDQuest, August 21, 2019) Nathan Lovato is also giving a talk about VSE at Blender Conference in October. Janne Liljeblad released Flowblade 2.2. Some of the new features are: RotoMask and FileLumaToAlpha filters, LumaToAlpha compositor, and some UI updates for the titler and the keyframe edit tool. See here for more details. And here is a video that demonstrates using the roto mask. The Pitivi team started merging GSoC code. 
The marker bar is now part of the master branch in Git. As for Shotcut, Dan Dennedy introduced multiselection to the timeline (as well as Select All/None actions) and added several new video filters: Blend Mode, Elastic Scale, Threshold, Posterize, Halftone, and Dither. Audio and music Will Godfrey released a new version of Yoshimi, a free software synth. Some of the highlights are: extensions to AddSynth voices and modulators, a new AddSynth noise type, extra mute options, a global bank search entry. Robin Gareus added pYIN support to Ardour for frequency estimation in audio. The change was introduced after looking at what David Healey has been doing with Lua scripts in Ardour: Robin also added progress notification for Lua script execution and introduced support for new LV2 extensions (backgroundColor, foregroundColor, and scaleFactor) that allow a host to inform plugins about the host color theme and UI scale factor, so they play better with non-default themes and on HiDPI displays. The change requires patching both the host and the LV2 plug-in code. Here is what you get with the default Ardour theme in the upcoming version 6 and Robin’s limiter plug-in: And the same with a brighter theme called Blueberry Milk: Among other noticeable changes in the program, Nikolaus Gullotta of Mixbus fame added sortable Time Span, Length, and Range name columns to exporting dialogs. And Len Ovens continues his work on the foldback bus. JP Cimalando made the initial release of an LV2 effect called stone-phaser, a phaser similar to the original vintage Small Stone pedal from the 70s. Tutorials Nathan Lovato explains how to create a tileset for a game in Krita with file and clone layers: Xavier Shay explains how to recreate a Juno-60 with VCV Rack. 
New Inkscape timelapse from grafikwork, this time on drawing a chocolate icing donut: Art and showcases Barandanduen posted a new artwork made with GIMP: New artwork by Ray Waysider, made with Krita: Fennec fox render by Kanishk, made with ZBrush and Blender: Felipe Torrents did a new lighthearted animation with Blender: Spring is coming! #b3d #blender3d #eevee #spring #paraguay pic.twitter.com/GxJSj5Cuil (Felipe Torrents, @FelipeTorrents, August 27, 2019) There’s more Blender goodness in Bart’s weekly review of best artworks.
Week highlights: lots of under-the-hood work in GIMP, new features in Krita and Blender, new release of Kdenlive, CUPS changes the license, a variety of projects are wrapping up their GSoC participation for this year and post updates. Graphics Part of the GIMP team met at Chaos Communication Camp near Brandenburg (Germany) for a hackfest. They spent most of the week improving the new plug-in API and making plug-ins use it. Additionally, Michael Natterer rewrote memory management for plug-ins, and Jehan (not present at CCC) merged his branch that adds an object-oriented-like approach (discussed in the previous week’s recap). He continued working on submission of signals from core to plug-ins in a separate git branch though. There’s also some talk on IRC about adding a user preference for associated/non-associated alpha as a switch in the Image menu. Let’s wait for this to be actually delivered, but it’s good to know this is on the radar. There haven’t been many feature changes in GEGL (save for the Meson port), but Øyvind Kolås added proper greyscale color space support to the babl library and made a new release. A few people asked me for an opinion on the fork of GIMP called Glimpse. At first, I considered posting in detail about Glimpse but then thought better of it. Here is what I can say on the matter, and since I’m a GIMP contributor, please take this with an extra bag of salt. The GIMP team has been suggesting to fork it in extreme cases (such as rebranding) for years. It is perfectly fine to do so as per the terms of the GNU GPL, although, so far, most attempts have been unsuccessful. Contributors to Glimpse have never been GIMP contributors in the first place, they aren’t known in the GIMP community, and they don’t seem to have any experience programming digital content creation software, so there is no real fragmentation so far. I spent ca. two weeks on Glimpse communication channels to figure out if they are the real deal. 
There is a clear and rather disturbing difference between how Glimpse contributors/moderators claim they treat the upstream project and what they actually do and say about GIMP. This is the opposite of impressive. The mutual hostility between supporters and haters of Glimpse doesn’t bring any value to the overall community. If you are among haters of Glimpse, please consider leaving them alone and letting them give it their best shot. Likewise, you are not getting anywhere by annoying GIMP developers. The Krita team has been mostly fixing bugs. E.g. Dmitry Kazakov fixed absolute brush rotation on rotated canvas. However, Boudewijn Rempt also reverted the removal of JPEG2000 support via the OpenJPEG library that he did in 2016, and updated the code to use the present-day API of the library. This is currently in a branch. Miguel Lopez, better known as Reptorian, contributed Spiral and Reverse Spiral modes for the Gradient tool. This is really fun! I witnessed Reptorian going from being hard on developers on Reddit a few years back to becoming a valuable code contributor (delivering quadratic blending modes and a high pass filter). Take notes, people! :) Nathan Lovato submitted GDquest’s Batch Export add-on to Krita for review and inclusion as part of the upstream project. Speaking of which, there’s another interesting merge request by Dmitrij Antsevich, adding an Export Group as Layer switch for the exporting plug-in, so that each layer group would be flattened into a single respective layer for exporting. The FontForge team is taking a new approach to communicating with users. Fred Brennan picked up the stale Twitter account and started turning it into pure gold by showing new features and recording quick video tutorials explaining the basics of using the font editor. FontForge has always had a frustratingly buggy Expand Stroke feature. Scores of known issues in it exist. With great joy, then, do I tease a replacement which works on any convex shape. 
While imperfect, it solves all known issues in its predecessor. (Author: Skef Iterum) pic.twitter.com/xmhq5sdxYV (FontForge, @FontForge, August 20, 2019) CUPS 2.3.0 is out and now ships under the terms of the Apache 2.0 license rather than GPL/LGPLv2, although Michael Sweet added a GPL/LGPL exception that you can read at the bottom of the NOTICE file. This shouldn’t come as a surprise given that Apple has owned the project since 2007. Back in July 2007, when Michael revealed the acquisition, he stated: “CUPS will still be released under the existing GPL2/LGPL2 licensing terms, and I will continue to develop and support CUPS at Apple.” Well, this lasted a whopping 12 years. On the code level, the new release adds support for IPP presets and finishing templates, brings a variety of bugfixes, and includes a new ippeveprinter utility (based on the old ippserver sample code). For more info, see the release log (some new features are mentioned in the respective release logs of betas and release candidates). Animation Synfig had a successful Google Summer of Code participation. Here are reports from their students: Vectorization of Bitmaps by Ankit Kumar Dwivedi; Export animation for Web using Lottie by Anish Gulati. It’s been a while since I last posted anything about Pencil2D. Most work these days is done by Oliver Stevns and someone known as scribblemaniac. Over the summer, they improved the UI here and there, added configurable constraint rotation, and fixed some bugs. The work isn’t very fast but rather steady, which is great. Their latest release was done at the spring/summer edge, you can read more about it here. The OpenToonz team has been applying pull requests on GitHub in batches lately. This may or may not mean there is a new release coming. 3D and VFX Pablo Vazquez did another awesome review of recent changes leading up to Blender 2.81: outliner changes, Intel’s denoiser, voxel remesher, Math node etc. 
Some of the other new things in Blender are:White Noise nodeNew snap options: Edge Center and Edge PerpendicularNew Grease Pencil operator Merge by DistanceDo you know new #greasepencil operator Merge by Distance? #b3d #b2d pic.twitter.com/4E8K7PY3Eo— antonioya (@antonioya_blend) August 19, 2019 Even more, there s a new proposal for updated particle nodes UI which deals with issues pointed out in the previous proposals, namely, the connection between particle types and their behaviors not being obvious enough, and many (potentially) disorganized floating nodes in the node tree.Soft8Soft finally released Verge3D 2.14 for Blender 2.80, featuring augmented reality support (WebXR), morph target controls and a parametric models demo, font loading and texture-from-text features, normal map generator and more.Gray Olson posted the final update on her GSoC project for appleseed for which she created a unified viewport in appleseed.studio displaying several possible views of a scene, allowing to switch between them and overlay data and widgets on top of it.Jeremy HU, who also got Epic MegaGrant in July, keeps posting updates on Dust3D.New feature preview: Copy Color / Paste Color #Dust3DThanks @satishgoda for the suggestion.https://t.co/kBYlwVblzQ#gamedev #indiedev #lowpoly #3dmodeling pic.twitter.com/xELMwznMSW— Jeremy HU (@jeremyhu2016) August 20, 2019 Game design and programming Godot s: 8 Google Summer of Code students are doing fine. Here is the latest report.There s also a very much welcome update from Hugo Locurcio:A new and improved Project Manager UI has landed in #godotengine! Here's a before/after comparison: pic.twitter.com/CumuBM57lY— Hugo Locurcio (@HugoLocurcio) August 20, 2019 CAD WandererFan added an alpha version of a welding symbol editor to the master branch of FreeCAD and is looking for input from users. Some nightly builds are available.The OpenOrienteering Mapper team have been steadily releasing new development version with new features and bugfixes. 
Some of the changes over the summer are: mobile version for Android, experimental OCD 2018 importing and new OCD exporting (version 8-12, including georeferencing), GeoTIFF support, improved CMYK PDF exporting. Have a look for yourself and maybe give it a spin.SolveSpace is getting long overdue development love from contributors who are now taking over the project from whitequark. This is not an easy process, you probably shouldn t expect releases any time soon, but we ll see. Video Kdenlive 19.08 was out earlier in August. Some of the release highlights:3-point editing (at last!)Simple speed adjustment by Ctrl+dragging edges of the clipConfigurable number of channels and sample rate in the audio capture settingsClip transcoding re-enabledDefault fade duration is now configurableFor more information, please see release notes.New features never stop arriving to Blender Power Sequencer:Adding a new useful feature to the trim tool - Blender's vse doesn't treat gaps as selectable elements. Power Sequencer 1.4 will be there to help!https://t.co/f57tcqii5t pic.twitter.com/XkHZDCccTI— GDQuest (@NathanGDquest) August 21, 2019 Music There was an interesting discussion about UI on Ardour s IRC channel after last week s interview with Oleg Kapitonov, and the immediate result was that Robin Gareus replaced text captions with icons on buttons in plug-in windows. So when you use plug-ins with narrow natrive UIs, the dialog won t be as wide as before.<img src="https://librearts.org/2019/08/week-recap-28-august-2019/sw-ardour-plugins-window.jpg" loading="lazy" alt="Tighter plugin window UI in Ardour 6 alpha" width="100%"/> Robin keeps improving icon-related code ever since. He also bundled x42-tuner with Ardour and dropped rule-based midifilter.Meanwhile, Len Ovens resumed his work on the foldback bus. Essentially it s a software implementation of stage monitoring where an output is tailored for a performer to help them hear themselves. The new code is, um, really, really new. 
A lot more will follow.And yes, all this new stuff will eventually be part of Ardour 6.New version of VCV Rack is out with bugfixes and new API features. Tutorials The Blender team continues releasing videos on 2.80 features (there s a separate playlist on YouTube for that). The most recent addition is a video on sculpting tools: New kickass 1-minute Blender tutorial from Ian Hubert, this time on creating post-apocalyptic cities:Lazy tutorials #8- making a post apocalyptic city! If you use the amazing BoolTool addon, it makes the whole process way easier. #b3d #blender3d #tutorial #city pic.twitter.com/wuOH38Dpdx— Ian Hubert (@Mrdodobird) August 20, 2019 Chris Kearford posted a Non-photorealistic explosions with Blender tutorial with a few videos.<img src="https://librearts.org/2019/08/week-recap-28-august-2019/edu-chris-kearford-npr-explosions-blender.jpg" loading="lazy" alt="Chris Kearford, NPR explosions in Blender" width="100%"/> New tutorial from GDquest on using physics layers and masks in Godot: Fred Brennan posted a tutorial on changing the ascender, cap height, x-height, & descender: Ramon Miranda explains 10 tricks to paint faster and better in Krita: Art and showcases Creating UI motion graphics inside Blender 2.8 Eevee #b3d #blender #eevee pic.twitter.com/KaFOjSnfHp— siraniks (@siraniks) August 21, 2019 Blender2.8 Cycles Render Cycles Blender #b3d pic.twitter.com/TY3yT6htxf— Hirokazu Yokohara (@Yokohara_h) August 20, 2019 Great work by Philipp Urlich, made with Krita:<img src="https://librearts.org/2019/08/week-recap-28-august-2019/art-krita-philipp-urlich-tee-brothers.jpg" loading="lazy" alt="Philipp Urlich, Krita, Tree Brothers" width="100%"/> More speedpainting with Krita by Sylvia Ritter:<img src="https://librearts.org/2019/08/week-recap-28-august-2019/art-krita-sylvia-ritter-speedpainting-250819.jpg" loading="lazy" alt="Sylvia Ritter, Krita, speedpainting" width="100%"/> Financials Last Sunday, I woke up to a whopping 600 euro donation from Simon 
Repp. Simon works in multiple disciplines, both graphics and music. So I think I probably didn't entirely mess up by going beyond the topic of image editors and 3D :) Thank you, Simon!
The history of this project is not exactly typical. A few guys gathered in an off-topic section of the linux.org.ru forum to talk about guitar amp/cab simulation on Linux. And mostly, they were unhappy with the existing options.

Around page 9, Oleg Kapitonov, who started this thread, was using specmatch to make impulses from YouTube videos demonstrating the clean/crunch sound difference. On page 10, he was already prototyping emulation models with Scilab…

Some 50 pages later, we now have the tubeAmp guitar amp emulator and half a dozen plug-ins, all in LV2: an octaver, a vintage fuzz pedal, a tube screamer etc.

I've been watching this thread with fascination since its inception, so I guess it's time to share it with the rest of the world :) Hence, here is a quick interview with the developer.

Oleg, let's get right down to it :) If I remember correctly, a good chunk of R&D was done by measuring the signal in different cascades of another guitar amp simulator (Amplitube). So that basically makes tubeAmp a derivative of a derivative project. Do you ever get nightmares where you lay your hands on a real guitar amplifier, disassemble it, make impulses and, with cold sweat running down your spine, realize YOU HAD IT ALL WRONG? :)

No, it wasn't like that really. The project began with the idea of making a universal model of a guitar amplifier instead of emulating specific electronic circuits.

If you look from the point of view of mathematics, not electronics, a typical guitar amplifier works like this: the guitar signal first passes through a certain filter, then it is limited, then it passes through a certain filter again. That is, the model of any amplifier can be simplified to an input filter, an overdrive, and an output filter. 
The sound of such a model will depend on the frequency response of the filters and the overdrive curve, and it can mimic any type of amplifier, vintage or modern.

If you want your model to produce a familiar sound instead of something completely alien, your model's parameters must correspond to those of real devices. To do this, I developed a method for obtaining these parameters with a test signal. The same approach is used in Kemper Amps, but, of course, the method there is more sophisticated and more accurate.

So no, I don't have nightmares )) I tested my method on Amplitube, and the profiles in the first release were obtained from there (but noticeably adjusted manually, stealing the sound is not good).

With a real device, the parameters will be different, but the mathematical principle of the amplifier itself cannot be different. These are just filters and a limiter. Everything that does not fit into this model is basically a nuance.

During the discussion in the forum, you said you weren't fond of the idea of going along the beaten track of imitating specific amps. You said you'd rather make some sort of a generalized synthesizer of guitar amplifiers. Did you succeed? What's your opinion on this currently?

I ended up doing something like that indeed. The tubeAmp plugin contains a universal amplifier model with which, in principle, you can get sounds of any type. It all depends on the profile.

To create profiles, I am currently developing an editor where you would be able to set all parameters manually, so that's the part that is a kind of amplifier synthesizer. You can also create new profile parameters with a test signal, which is just a WAV file. 
You can pass it through a real device, through a software processor, through a model in Qucs or OpenModelica, and so on.

The profile taken with the test signal can be edited manually as desired.

Do you have any kind of roadmap or at least an idea in which direction to develop the project?

Here is what I have in mind.

For tubeAmp:

Version 1.2: a more sophisticated model, with the addition of asymmetric distortion forms. They cannot be obtained right now, and thus profiles of vintage amplifiers do not sound exactly as they should. I also want to add transformer nonlinearity to the model.

Version 2.0 will have a new profile format instead of *.tapf. I'll switch to YAML to describe the block chain and the parameters of the blocks. First, this will make the format vastly more human-readable (and not binary, as it is now). Second, you would then be able to create more flexible models, including as many filters and overdrives as you like, in the order that you want. That is, one tubeAmp plug-in will be able to emulate both an amplifier and a set of pedals. If necessary, the entire chain can be placed in one profile.

For the profile editor, my plan is:

Version 1.0: the ability to create profiles manually or with a test signal. I'll also add a deconvolver and an automatic equalizer.

Version 2.0: transition to the new profile format. This will require the addition of a structural diagram editor, such as in Simulink or OpenModelica, but simpler.

Oh, and the tubeAmp hardware project is getting ready. This is a DIY guitar processor based on the ADAU1452 DSP. The control program for this processor will be able to load profiles from the tubeAmp plugin into it, and it will give the same sound, plus the main plugins from the KPP set. And this is practically without the signal delay that everyone hates so much on the PC.

The device is focused on working with Linux, and all the software will be open under the terms of the GPL. I will also open all PCB designs and make it an open hardware project. 
You won't need proprietary programs or drivers to use it.

What stage is the DIY project at?

I bought all the electronic components I need for the guitar processor. These are ready-made modules: my initial plan was that you wouldn't have to solder anything, with maybe an exception for cables between components. Now I need to write the DSP firmware and a PC program that interacts with the guitar processor. It's not difficult, but so far I have given all my attention to the profile editor. Once I release the editor, I can go all in on the DIY processor project.

At some point in the discussion on the forum you wrote this: "Personally, I'd rather smash my head against the wall than admit something is not possible on Linux yet possible on other systems. Linux is not the problem, it's the people who did not write this or that program. Don't moan, write code". Now that you have the experience of writing something that was missing for you, what was it like?

What I wrote is that I was more interested in the technical side of the electric guitar than in improving my playing to a professional level (which is unattainable for me). If I just wanted to play the guitar, I would buy a guitar amp, that’s all. But I wanted to delve into digital signal processing, and delve into it I did. It's close to what I do at work, and I've been fascinated by electronics since I was a child.

Do I think that guitar players need to write code? No, of course they don't have to. They need to play the instrument. But there are many programmers who like playing guitar, and they can make software. Just look at the crapton of VST plugins for Windows. For Linux, there were just Guitarix and Rakarrack. So I thought I'd join.

While working on this, I acquired very useful knowledge in the field of digital signal processing. I recently bought an RTL-SDR dongle and started studying the processing of radio signals. 
Turned out, it's all the same.

Do you see any connection between your own guitar playing and the quality of your plugin? Like, you played guitar a little, didn't like the sound, tweaked the code, rebuilt it, and liked the results better?

Yeah, sure! I did everything to make it sound the way I wanted it to. I used to constantly compare the sound of real amps on YouTube, picked up by a mic, with how my plugin sounded. So I did something like what you described on a daily basis for a long time. I don't touch that code right now only because I'm focusing on the profile editor. Once it's ready, adjusting the sound will be possible on a completely different level.

The code that I have already released isn't perfect and raises some well-deserved questions, all because I made quick and dirty profiles.

All this time I've been playing through my plugins only. I can hear my bloopers better when I do, and with decent settings, I can vary between clean and crunch easily. I think I started playing better, if I may call it that.

A few months back, Hermann Meyer, the lead developer of Guitarix, stopped by to send you a few pull requests to add Makefiles and suchlike. Did you maybe have a treacherous thought at that moment to stop working alone and join a larger project? :)

I'm very grateful to him. He actually also helped squash a bad bug in the plugin.

I don't see myself joining Guitarix, even though I basically started my journey by modifying its code. We discussed that in the forum too.

What I think is that it's now better to make plugins for DAWs rather than standalone applications. Plus, I see some historical problems in the architecture of Guitarix; it's an old project after all.

Actually, my impression is that Hermann decided to turn everything into plugins and move all new features there. At least, he's been releasing LV2 plugins in batches lately.

After all, Guitarix is, in fact, a host of such plugins, with some hardcoded design decisions. 
Doing everything in a DAW is more flexible, in my opinion. You need it to record your guitar anyway. So why not use a DAW and some 3rd party plug-ins like the ones both of us make?

How could an interested developer make him-/herself useful for your project?

It is better for a new developer to wait for the first release of the profile editor, which will be soon. The editor is, in fact, the main part of the project, on which everything depends.

There will be a huge field of work for every taste: from a graphical interface to complex mathematics. There's even a place for a front-end developer ))

What about non-coding contributions? What kind of work is in demand?

One might help by using the editor. Like, take a profile from your equipment. Or maybe fix/customize an existing profile in the editor, then upload the result to the profile repository, another project I'm planning.

I'm also lacking examples of how my plugins sound. Sure, I can record some stuff myself, but I don't play in all possible styles. If you are a Linux user and you could record yourself playing your usual guitar parts through my plugins, that would be a big help.
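The amplifier model Oleg describes in the interview (an input filter, then a limiter, then an output filter) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the one-pole filters, the tanh limiter, the cutoff frequencies, the drive value, and every function name below are placeholders picked for brevity. None of it is tubeAmp's actual code, profile format, or measured data.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """One-pole low-pass filter; a crude stand-in for a measured response."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    state = 0.0
    out = []
    for x in samples:
        state += alpha * (x - state)  # move the state toward the input
        out.append(state)
    return out


def one_pole_highpass(samples, cutoff_hz, sample_rate):
    """High-pass built as the input minus its low-passed copy."""
    low = one_pole_lowpass(samples, cutoff_hz, sample_rate)
    return [x - l for x, l in zip(samples, low)]


def overdrive(samples, drive):
    """Soft limiter: tanh is an assumed curve, not a profiled one."""
    return [math.tanh(drive * x) for x in samples]


def amp_model(samples, sample_rate=48000, drive=8.0):
    """Input filter -> overdrive -> output filter (all values hypothetical)."""
    pre = one_pole_highpass(samples, 120.0, sample_rate)   # input filter
    clipped = overdrive(pre, drive)                        # limiter
    return one_pole_lowpass(clipped, 5000.0, sample_rate)  # output filter


# Run a 220 Hz test tone through the chain.
sample_rate = 48000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(4800)]
processed = amp_model(tone, sample_rate)
```

In this picture, a "profile" would amount to the filter responses plus the overdrive curve, which is why swapping those three ingredients is enough to move between very different amplifier characters.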
Week highlights: GIMP gets Lua and JavaScript support, Krita gets notifications for new versions availability, new releases of G’MIC, DJV, and Shotcut, more exciting news on sculpting tools in Blender 2.81, a bunch of great tutorials and artwork, and more. LibreArts podcast In early June, I attended Libre Graphics Meeting and did two interviews, one with Pat David and one with members of the Krita team. Unfortunately, I messed up the video part of the interview with Krita devs. So I decided to reuse the only thing I could salvage, which is the audio, and started a podcast.This is the first episode where sat with Boudwijn, Agata, and Wolthera to talk about financing the Krita project, non-coding contributions, pacing the development with regards to new features vs bugfixes, etc. I m likely to release the text version as well.<iframe width="100%" height="300" scrolling="no" frameborder="no" allow="autoplay" src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/664922249&color=%23ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false&show_teaser=true&visual=true">For now, I have a few ideas what to do with this podcast, but I don t currently expect to do more than one episode a month.The podcast s name also reveals a coming and long overdue rebranding of this website. Graphics GIMP developers continue refactoring the new plug-in API, and there s more to that now:Apart from the Python 3 support, we now have officially support for JavaScript and Lua plug-ins in @GIMP_Official. They come with self-documenting demo plug-ins in GIMP itself to help plug-in creators.As usual, don't forget you can support our work : https://t.co/Id9D08K62T pic.twitter.com/T2lqE6kHxn— ZeMarmot (@zemarmot) August 16, 2019 Jehan isn t stopping at that though. Last week, he prototyped a more object-oriented API (currently in a branch and undergoing code review by Michael Natterer). 
Here is how Jehan explained this on IRC:

It goes much further than fixing Python, that's just a side effect. It will also bring signals to plug-ins. Basically a plug-in will be able to connect to signals that the core would emit.

Typically obvious things are things like "being notified when the image or the layer we are working on has been deleted". But this is only the tip of the iceberg.

For instance, on our local GIMP, I have a commit to change how layers are named automatically (the #2 #3 etc.). This commit had been refused way back years ago because it changed a common thing people were used to.

I would introduce a signal "new-layer" and an extension could connect to this signal and step in to change the automatic naming. Then my core commit could become a mere plug-in. It's common hooking system to customize software.

A more technical explanation is available in this merge request.

GEGL has finally switched to the Meson build system and removed support for autotools. Øyvind mentioned this in the commit message:

Getting rid of autotools also allows us to continue refactoring away file names and directory layout away from constraints from autotools.

He also improved how the babl library handles greyscale spaces and patched the JPEG saver accordingly.

Krita developers and contributors posted more reports from the recent Krita sprint: Raghavendra Kamath, Wolthera van Hövell tot Westerflier, Dmitry Kazakov.

Among development news: Kuntal Majumder continues hacking on the magnetic lasso (GSoC project). Karl Ove Hufthammer is helping Wolthera with the graphic tablet tester widget and adding small UX tweaks here and there. Scott Petrovic is adding an in-app notification about the availability of new Krita versions. Tilya enhanced the gamut mask feature: now when a gamut mask is active, it is also shown in the on-canvas popup color selector.

Igor Novikov refactored the look of the Preferences dialog in sK1 and posted a screenshot. 
Here is the old thing (keep in mind that my GTK theme is different):<img src="https://librearts.org/2019/08/week-recap-19-august-2019/sw-sk1-old-prefs-look.png" loading="lazy" alt="sK1 old preferences" width="750"/> Here is the new look:<img src="https://librearts.org/2019/08/week-recap-19-august-2019/sw-sk1-new-prefs-look.png" loading="lazy" alt="sK1 new preferences" width="750"/> G`MIC got a new denoising filter. Version 2.7.0 is now available for downloading. David is also aware of plug-in API changes in upcoming GIMP 3 and stays in the loop.DJV developers recently released a major update of their CG data/footage view and annotation tools, with more subsequent releases. Highlights: essential color management, revamped UI with HiDPI support, better DPX and Cineon playback performance, new keyboard shortcuts and mouse actions (panning, scrubbing), and more. 3D One of the most interesting bits of news last week is that the Evangelion studio is now adopting Blender 3D for production.<img src="https://librearts.org/2019/08/week-recap-19-august-2019/sw-blender-evangelion.jpg" loading="lazy" alt="Evangelion adopts Blender 2.80" width="750"/> There s a ton of sculpting tools updates from Pablo Dobarro, including a new Pose tool…Pose brushThis new brush lets you pose your model simulating an armature deformation. It is fully automatic. It does not need a rigged model, good topology, manual pivot points, transform gizmos or masks. #b3d pic.twitter.com/DGFx141xXw— Pablo Dobarro (@pablodp606) August 12, 2019 a voxel remesher The Voxel Remesher is now in masterIt introduces a new sculpting workflow without any of the limitations of Dyntopo (no geometry errors or performance penalty). It is also useful for simulations and 3D printing. 
#b3dhttps://t.co/Ug2MYOIcYR pic.twitter.com/NcB5ChfE92— Pablo Dobarro (@pablodp606) August 14, 2019 ...dynamic mesh preview The dynamic mesh preview and the grab active vertex option are now available in the sculpt-mode-features branch #b3d https://t.co/jIuBNN64SB pic.twitter.com/rzCkSLzQth— Pablo Dobarro (@pablodp606) August 12, 2019 ...and more. By the way, the HardOps add-on already supports some of these new features.The new file browser, being worked via Google Summer of Code program, is shaping up nicely too, as Bill Rey reports:<img src="https://librearts.org/2019/08/week-recap-19-august-2019/sw-blender-new-file-browser.jpg" loading="lazy" alt="New file browser in Blender 2.81" width="750"/> Meanwhile, Intel s OpenImageDenoise is now available as a compositing node in Blender. Grab a nightly build or wait for the 2.81 release.Stuart Attenborrow developed a photogrammetry add-on for Blender (2.79 and 2.80), see this post on BlenderNation for details and grab it on GitHub.Arnaud Couturier released SceneSkies 1.2 which is a HDRI manager for Blender, now coming with version 2.80 support. The goal of SceneSkies is to make HDRI-based lighting in Blender easy and fast.CADKartik Kumar recently posted an update on his GSoC project where he s adding hardware-accelerated rendering to LibreCAD. You can test his code by building it from his personal temporary fork on GitHub.Qingfeng Xia recently resumed his work on documentation for people interested to write modules for FreeCAD. He s now covering the moving target known as version 0.19 :)VideoNathan Lovato submitted their (GDquest) Power Sequencer VSE add-on for inclusion to Blender 2.81. We ll see how it goes!MattKC posted a huge official project update on Patreon regarding ongoing refactoring efforts. Do check it out! 
Highlights:New flexible node-based pixel pipelineOpenColorIO everywhereBackground tasks display and managementMultiple decoders/encoders supportCaching/Rendering engine plannedDan Dennedy released Shotcut 19.08, featuring workflow improvements for playlist editing and video stabilization. Someone also contributed 360 video plugins for Shotcut to convert projections, rotate in 3D, stabilize, and punch-out a normal rectangular view.TutorialsIan Hubert won the Internet last week with this 1 minute long (!!!) tutorial on animating huge crowds in Blender. In a post at 80 Level, Gesy Bekeyei explained the production of his hard-surface project Lobster made fully in Blender.<img src="https://librearts.org/2019/08/week-recap-19-august-2019/tutorials-blender-gesy-bekeyei-lobster-submarine.jpg" loading="lazy" alt="Gesy Bekeyei, Blender, Lobster submarine" width="750"/> Learn to model donuts with Blender 2.80 and Eevee: This quick guide will walk you through the process of texturing with ArmorPaint 0.6: Dimitar from UH Studio Design Academy started a tutorials series to introduce FreeCAD to architects and Revit users. Here is the first video, and you can find the rest in this thread on FreeCAD s forum. New FreeCAD tutorial on drafting steel external stairs using Sketcher and Dodo workbenches. 
Art and showcases Bassam Kurdali says Wires for Empathy is still an ongoing effort:Some wires for empathy news: The project is *not dead* - All character animation is complete, and I'm currently one-person-ing a bunch of complex vfx animation; Then it's time for final lighting and render.— bassam kurdali (@bkurdali) August 13, 2019 Filipe Lima Botelho posted some renders from his recent interior design project made with Blender 2.80 and rendered with Cycles.<img src="https://librearts.org/2019/08/week-recap-19-august-2019/art-blender-filipe-lima-botelho-beach-house.jpg" loading="lazy" alt="Filipe Lima, Blender, Botelho Beach House" width="100%"/> Playing with motion blur and strange procedural citiessrc: https://t.co/1lYvGUOdEN (libs/scenes/procedural-city.js)#threejs #javascript #3d #generative #shader #glsl #webgl #generativeart pic.twitter.com/jHEi2JzfgR— Domenicobrz (@Domenico_brz) August 15, 2019 Sady Fofana rendered this cyberpunk urban scene with Eevee:<img src="https://librearts.org/2019/08/week-recap-19-august-2019/art-blender-sady-fofana-street.jpg" loading="lazy" alt="Sady Fofana, Blender, cyberpunk street scene" width="100%"/> Felipe Del Rio posted a Cycles render he did for CG Masters Ice Cream Shop challenge.<img src="https://librearts.org/2019/08/week-recap-19-august-2019/art-blender-felipe-del-rio-Ice-cream-shop.jpg" loading="lazy" alt="Felipe Del Rio, Ice Cream Shop" width="100%"/> Interested in 4-cylinder steam engine porn? FreeCAD community member un1corn posted some! :)<img src="https://librearts.org/2019/08/week-recap-19-august-2019/art-freecad-un1corn-4-cylinder-steam-engine.jpg" loading="lazy" alt="un1corn, FreeCAD, steam engine" width="100%"/> New Inkscape artwork from Sven Ebert:<img src="https://librearts.org/2019/08/week-recap-19-august-2019/art-inkscape-sven-selfie-classroom.jpg" loading="lazy" alt="Sven Ebert, Inkscape, Selfie" width="100%"/>
The interview with Boudewijn Rempt, Agata Cacko, and Wolthera van Hövell tot Westerflier was taken on 1 June 2019 at Libre Graphics Meeting in Saarbrücken, Germany. We spoke about the financing of the Krita project and non-development contributions that have proven to be most helpful for the project.<iframe width="100%" height="300" scrolling="no" frameborder="no" allow="autoplay" src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/664922249&color=%23ff5500&auto_play=false&hide_related=false&show_comments=true&show_user=true&show_reposts=false&show_teaser=true&visual=true">
Week highlights: great new features in GIMP and Krita, new digikam release, better sculpting tools coming to Blender 2.81, GSoC project in Pitivi bringing better UI and features. Graphics GIMP developers have been busy refactoring everything related to plug-ins and the procedure database (PDB). More interestingly, Ell started working on automatic expansion of layers when using transformation tools.The initial implementation adds two new features:New Image mode available next to Layer, Selection, and Path modes in all transformation tools. This mode works best for single-layer images and automatically resizes the canvas when the bounding box of the transformed image is larger than the canvas.New Image > Transform > Arbitrary Rotation command that launches the Rotate tool in the Image mode.So here is before and after: The next step is to figure out some things about the automatic expansion of layers. E.g. in some cases the bounding box of a layer can become smaller after rotation, and it is unclear how GIMP should handle that: should there be two separate options for expansion and contraction? Should contraction be suboption of expansion? Should GIMP just always expand and contract?This is where you can actually help by pinging the @GIMP_Official account on Twitter and giving some argumentative feedback based on your use of other programs and common sense :)GIMP has also moved to using Gitlab s continuous integration system, and GEGL will be using the Meson build system in the next release (coming soon).Pavol Rusnak created a GauGAN plug-in for GIMP, that creates photorealistic images from segmentation maps (see the original page).Krita developers had an almost week-long sprint in Deventer (Netherlands), and it looks like it was their largest attended sprint so far.Im in Deventer! 
Here is an attempt of a group photo at Krita sprint 2019 pic.twitter.com/ghzTrLMKna— David REVOY (@davidrevoy) August 6, 2019 In terms of code commits, it wasn t exactly a programming sprint. My gut feeling, however, says that given the number of artists who attended, we are likely to see useful changes piling up soon as the result of conversations.Some great things are happening already. For instance, Boudewijn started adding action search in a dedicated branch, and Wolthera started adding a little helper widget to calibrate pen pressure for graphic tablets. There s still UI to be added, but this is likely to land to version 4.3.0 too. Meanwhile, Kuntal Majumder continues his work on the Magnetic Lasso tool.PhotographyThe digiKam team released a maintenance update, version 6.2.0, that brings wider camera support and better metadata support via new libraw and Exiv2 releases respectively. Icon also get rendered properly on HiDPI displays now. For full release notes, see here.Jean-Christophe recently started working on a Spot Removal tool in RawTherapee. It s already functional and lives in a dedicated branch for now.You don t absolutely have to use it on skin blemishes. If a particularly nasty bird flew into your landscape shot and the shutter speed wasn t fast enough to render it sharply, the tool will do just fine. Enable the tool in the Detail tab, click the little pencil button to enable spot drawing mode, then Ctrl+Click on a point, drag immediately to set the reference, drag inner circle to set radius, drag outer circle to set the fuzzyness of the selection. 3DPablo Dobarro continues improving sculpting tools in Blender. Follow him on Twitter for more updates.Sculpting with modifiers can be difficult because you can't see where the real vertices are. This new cursor mode dynamically displays the real mesh. It also isolates disconnected mesh components and it applies the maximum brush strength to the active vertex. 
#b3d pic.twitter.com/AQ7GaU7DE9 (Pablo Dobarro, @pablodp606, August 8, 2019)

CAD

Yorik van Havre published his July report on BIM improvements in FreeCAD. Highlights: various BIM Views improvements; the total thickness (the sum of the material layers) is now displayed on the MultiMaterial task panel; Building Parts are now able to clip the 3D view; multi-view grids and working planes have been refactored.

Video

FFmpeg 4.2 "Ada" is out with the usual crapload of changes, including AV1 decoding support. See the news section for all the details.

Thibault Saunier did a talk about OpenTimelineIO support in GStreamer at SIGGRAPH (see their earlier post on the initial implementation). For Pitivi, this means you would be able to open project data from other NLEs, mostly commercial ones.

There's news from their GSoC students too. Millan Castro, who works on markers, recently posted an update on his progress. So did Yatin Maan, who is creating a new UI for the effects library. Another interesting subproject is the support for nested timelines in GStreamer Editing Services, added by Thibault Saunier and contributed to by Swayamjeet Swain.

Alexandru Băluț also says he's going to merge the scaled proxies branch from last year's GSoC soon.

There's also a beta of the upcoming Shotcut 19.08 release available, with enhancements and fixes.

Music

Mixbus developers continue contributing to Ardour. Most recently, Ben Loftis merged his rework of the regions list, with the intention of making it more usable for work. In a nutshell:

The new Sources tab features source data with Take ID (time of creation) and original position (on the timeline). You can now do things like selecting all regions coming from a single take (commonly lost after you split them and rearranged their locations), sorting clips by time of creation etc.

The pre-existing Regions tab now features tagging of clips. You can tag clips by any criteria you can think of ("good/bad", "left/right mic" etc.) and then easily locate all of them. 
This will be available in Ardour 6.

Also, Nikolaus Gullotta is working on mixer snapshots (they are mixer presets really). The idea is that track templates are not sufficient: you might want to take a complete mixer setup from one project and apply it to another project. This is currently sitting in a dedicated branch, and so far it's unclear if it will make it to Ardour 6.

Tutorials

Mart's Struggle with Drawing brought us this little tutorial on adding reference images in Krita 4.2.x.

UkrArtDesign has a new tutorial on drawing SVEN speakers with Inkscape.

Boris Hajdukovic insists he's not making darktable tutorials. But you can still learn a thing or two from his videos that typically feature advanced blending of various filters.

Andrew Price shot another video on Blender 2.80 that now focuses on one question: WHERE DID MY BUTTON GO TO?

Art and showcases

The award-winning short film "The Box" is by Dušan Kastelic, and it's made with Blender!

New Cycles render from Jakob Scheidt comes with a making-of post at BlenderNation.

New artwork by Sylvia Ritter from her ongoing Tarot cards project.

Great concept design for a game, by Jimi John.
Week highlights: major changes in GIMP, exciting new features in Krita, new releases of Luminance HDR, Exiv2, Scribus, and FontForge, a gazillion news items in the Blender department, new ArmorPaint releases.

This is a mix of things that happened last week and things that happened since the last recap. Due to workload, weeklies have become more of monthlies for now. I also have to admit that so much has happened since the last one that I'm struggling to cover it. Which means there will inevitably be things I'm not mentioning. The usual way to improve that is by telling me on Twitter that I forgot something :)

Graphics

The GIMP team very nearly released 2.10.14 and 2.99.2 last week. That's right, developers were one inch away from cutting a new stable version as well as the first unstable release leading up to GIMP 3.

What happened is that they started updating how GIMP uses the wire protocol for plug-ins, adding GObject introspection, and porting GIMP to Python 3. The intention is to fix various architecture flaws accumulated over the years, port GIMP to use an actually maintained version of Python, and, coincidentally, open the gates for using more languages to write plug-ins (JavaScript, anyone?).

The whole thing used to be the proverbial axe hanging over their heads. The last time I saw them discussing this, about a year ago, they didn't know how much it would delay the release of GIMP 3 and wondered if they should postpone this until after the GIMP 3.0 release. As you can see, they took the plunge.

A week into the coding sprint, the foundation work seems done now, and several Python plug-ins that GIMP ships have been ported to Python 3, but there's more work to do. Much like any refactoring effort, the port also revealed a number of serious issues in the code that need to be taken care of. Most of the work is done by Michael Natterer and Jehan Pagès.
All of this will only be available in GIMP 3.0 (and 2.99.x dev releases).

Jehan also reported that he dropped the animation plug-in for GIMP that he worked on in the past. He's now implementing animation as a core feature and is considering making it a timeline rather than an x-sheet based solution.

Meanwhile, Øyvind Kolås made quite a splash on Twitter with his experiment in the color assimilation grid illusion that he did in GEGL. The illusion exploits the inability of human vision to perceive colors as they really are when they are surrounded by other colors.

So here are the original colors:
<img src="https://librearts.org/2019/08/week-recap-4-aug-2019/sw-gegl-grid-illusion-unprocessed.jpg" loading="lazy" alt="Grid illusion, original colors" width="100%"/>

Here is the same image converted to a greyscale image with a color grid on top of it.
<img src="https://librearts.org/2019/08/week-recap-4-aug-2019/sw-gegl-grid-illusion-processed.jpg" loading="lazy" alt="Grid illusion, processed image" width="100%"/>

The actual data is in shades of grey (see the color samples docker), but the color grid makes us believe we are looking at a full-color image.
<img src="https://librearts.org/2019/08/week-recap-4-aug-2019/sw-gegl-grid-illusion-close-up.jpg" loading="lazy" alt="Grid illusion, close-up" width="100%"/>

The operation is already available in GEGL and is likely to be accessible in the next GIMP release via the GEGL tool (unless the team finds a good place for it in the menu).

The original image from the experiment made the rounds on social media, mostly with the Creative Commons watermark cropped (which makes little sense, since it explicitly allows sharing). Øyvind ended up replacing it in the original post on Patreon for legal reasons. For more information about the illusion, see his second post.

Both GIMP and babl/GEGL projects are modernizing their infrastructure too. This involves switching to the Meson build system and GitLab CI.
The babl library already has both, GEGL only has GitLab CI support and might get Meson for the next release, and GIMP has the GitLab CI switch as an ongoing effort, with some outlook for a future Meson port as well.

The Krita team released several follow-ups to their much anticipated version 4.2. It's mostly bugfixing, but they backported one new feature from the development branch: rotating the canvas with a two-finger touch gesture.

The master branch is indeed where most of the fun is happening: a new SAI-like Luminosity blending mode (Agata Cacko), EXR support improvements (Boudewijn Rempt), and a new Windows-specific "Software Renderer" option for the OpenGL engine when a GPU is too old for Direct3D 11+.

The most exciting new feature for me though is the new Snapshot docker added by Tusooa Zhu. It allows you to create variations of the same work and switch between them freely. The concept should be pretty familiar to DAW users.

Tusooa is actually one of Krita's GSoC students this year, whose main objective is to modernize the undo system and allow things like Photoshop's History Brush. Another student, Sharaf Zaman, is working on an Android port that is already functional (that's where the touch rotation gesture in 4.2.x is coming from). Kuntal Majumder has a dedicated branch for his work on the Magnetic Lasso tool (it will be merged to master eventually). Alberto Eleuterio Flores Guerrero also has a dedicated branch where he hacks on making it possible to use an SVG file as input for brush engines.

The team also recently revived their YouTube channel with help from Ramon Miranda. And you might want to check out their recent post on how Krita gets developed.

As for MyPaint, earlier this summer, the flood fill feature by Jesper Lloyd got merged into the upstream project after sitting for quite a while in a GitHub fork. Meanwhile, Brien Dieterle has been hacking on bump mapping (better rendering, more blending modes support) and OCIO/filmic in his private spectral_log branch.
Per-layer texture settings. As a compositing operation it doesn't touch the layer data, so you can change the background or settings any time. It's fun to load old paintings and add a canvas effect. Finally a screenshot of the terrible interface WIP. @MyPaintApp @lgworld pic.twitter.com/xjgYih6QPF - Brien Dieterle (@BrienDieterle) June 17, 2019

Inkscape developers continue to viciously fix bugs for the upcoming v1.0 release. At this stage, most of the work is under the hood, so there isn't a terrible lot to tease you with. The team also launched their official forum at https://inkscape.org/forums/.

The Scribus team announced the release of version 1.5.5. Mostly, it's under-the-hood changes, including fixes for the new text engine to improve the support of complex scripts. One new feature (among others) you might like is an action search dialog, similar to that of Blender, GIMP, Olive etc.
<img src="https://librearts.org/2019/08/week-recap-4-aug-2019/sw-scribus-1-5-5-action-search.png" loading="lazy" alt="Action search in Scribus 1.5.5" width="100%"/>

As you can see on the screenshot above, it works a little differently though and lists all actions in the Page menu, not just menu entries that contain the word "page".

For information on other new features and download links, see the release notes.

Jeremy Tan released FontForge version 20190801. On the outside, it's a minor update featuring user decompositions and a Croatian translation. Under the hood, however, it's the result of cleaning up the code base and dropping Python 2 support in favor of Python 3.
Photography

Luminance HDR 2.6.0 was released in June and features 4 new tone mapping methods (Ferwerda, KimKautz, Lischinski, and VanHateren), faster tone mapping (patches coming from one of RawTherapee's developers), gamma adjustment and saturation at the post-processing stage (after tone mapping), and preview in the HDR wizard.

Ever since, Franco Comida has been hacking on integrating the librtprocess library into the program (in a dedicated branch on GitHub for now). I'm not sure if Luminance HDR would benefit from using the chromatic aberrations fix feature (or highlights recovery by inpainting), but advanced demosaicing methods are going to be useful for those loading raw files directly into the program.

Robin Mills announced a new release of Exiv2, featuring Nikon/AutoFocus and Sony/FocusPosition metadata, revised documentation, bug and security fixes, as well as new and updated translations. If you are unfamiliar with the project, Exiv2 is something you typically get as a dependency for most decent photography applications on Linux, and something you don't know you are using on Windows and macOS versions of said software.

Klaus Ethgen released a new version of Geeqie, a photo-centric image viewer for Linux. Among new features: initial support for Hi-DPI-aware rendering of images, better multi-display full screen mode, star rating display, and search with regular expressions.

3D and VFX

Where do I even begin?

A mere week apart, Epic Games announced supporting Blender Foundation with a $1.2 million grant, then Ubisoft announced that they will join the Blender Foundation's development fund as a corporate Gold member (€30K/year). It was a bit too much for passionate users who got suspicious, so Ton had to issue this statement:

I have carefully constructed the Blender organization to be independent, with distributed copyright and firmly rooted as Free Software. We are strong enough to accept the industry to come on board.
Welcome to our community, Epic Games, Ubisoft, Tangent, and many more! #b3d - Ton Roosendaal (@tonroosendaal) July 22, 2019

What does it all boil down to?

Epic has a MegaGrant program worth $100 million, and they approached Blender Foundation with a $1.2 million grant. The foundation asked Epic to deliver the MegaGrant incrementally over the next three years to ensure continuity of the work that is planned to be done.

Here is what's changing: better development coordination, technical docs, onboarding and support for new developers, implementing code standards and better engineering practices, improving the Phabricator platform for both users and developers, establishing an online support group etc. Essentially, the money will be spent to make Blender more professional.

Having Ubisoft as a corporate supporter means two things: direct funding that allows hiring more developers (the Blender Fund now gets over $83K a month), and developers from Ubisoft assigned to improve Blender. Additionally, Ubisoft Animation Studio will use Blender for their productions.

The foundation has already started hiring. Pablo Dobarro is joining the team to work full-time on Blender's sculpting and painting tools.

I'm refactoring some sculpt mode tools to use a new mesh API. All features in the sculpt branch should now be compatible with dyntopo and multires. As soon as the API is ready it should be possible to merge everything to master without any major problem. #b3d pic.twitter.com/I3NAmsflba - Pablo Dobarro (@pablodp606) July 26, 2019

Then Ton Roosendaal won the JPR Technology Advancement Award, and Pablo Vazquez got a brand new laptop with mad specs from Boxx thanks to Intel's intervention, because all of Pablo's gear was stolen from him in the US during SIGGRAPH.

No words to describe this. Our friends at @intel heard about what happened and connected me with @boxxtech and they immediately helped out.
An insane Intel i9, Quadro RTX, state of the art for computer graphics laptop #b3d #boxx #RTXON #isthisreallife pic.twitter.com/TVFHAly2yg - Pablo Vazquez (@PabloVazquez_) August 2, 2019

And the last important bit of news here is that the Blender team released the much anticipated version 2.80. No weekly report is large enough to cover all changes. You can get a pretty good idea from the official release notes and a video by Andrew Price.

In other 3D news, Lubos Lenco released an update for ArmorPaint featuring a curvature baker, 16/32-bit painting, EXR exporting, undo for layer operations, custom keymaps, and 256 layers per project. He's also adding some edge wear materials based on curvature baking.
<img src="https://librearts.org/2019/08/week-recap-4-aug-2019/sw-armorpaint-edge-wear.jpg" loading="lazy" alt="Edge wear in ArmorPaint" width="100%"/>

Finally, Frédéric Devernay revived the Natron project and is releasing updates a few times a week now, mostly with bugfixes and UX improvements. Check them out on GitHub.

CAD

FreeCAD continues steadily progressing. Here are some of the highlights from earlier this summer:
A whole new layers system: it applies object styling to a layer's children and is independent from groups. In the Draft workbench, enable it via Draft -> Utils. In the BIM workbench, there's now a Layers Manager tool that works just as you would expect if you have ever used another CAD system.
Finer control over IFC structure export to work around missing features in the IFC spec.
Selective IFC importing via the IFC explorer, where you can pick parts of the model to import.
The revamped addons manager now displays more information about each add-on that you can install.
The Render workbench now supports Cycles, which has been Yorik's plan all along ever since he started developing this workbench a few years back.

For a more complete overview, see this Patreon post by Yorik van Havre.
He's likely to post a July overview soon enough.

Also, an engineer known as RealThunder in the FreeCAD community is now also on Patreon. He (she?) makes releases of FreeCAD + Assembly3 for Windows, macOS, and Linux (AppImage). You might also like having a look at this thread on FreeCAD's forum for some background and additional info, although I realize you might have to resist reading all 134 pages of it, for sanity's sake.

Finally, Kurt Kremitzki posted an update on FreeCAD PPA improvements and Debian's Science Team packaging changes.

In other news, Open Cascade released CADRays under the terms of the MIT license. It's a GPU-accelerated unbiased physically-based renderer that works on AMD, NVIDIA, and even integrated Intel GPUs. So far, it looks a bit like a code drop: there have been no changes in the Git repository since the code publication.

Video

Dan Dennedy did a few releases of Shotcut. The most recent one features a bunch of bugfixes, a drop-down list of common frame rates for Export and Custom Video Mode, and an HD 1080p 50 fps video mode.

The Kdenlive team released an update in July with bugfixes and minor improvements. Their GSoC student Akhil K G has been steadily rewriting the titler back-end. You can find his weekly reports in the project's blog.

Sybren A. Stüvel reports that Blender 2.81 will feature support for the WebM video container, alpha channel support for VP9 video, and the ability to write Opus audio.

Olive's MattKC has been spending most of his time rewriting essential bits of code. We asked him for details, and this is what he replied:

The main purposes of the rewrite are two-fold: First is I realized the new redesign plans would require sweeping rewrites anyway. The timeline, effects system, and rendering pipeline as a whole would all need to be largely reworked and at that point that's pretty much the whole application!
It was looking like more than half of the app would need to be rewritten anyway to pull this off.

The parts that remained, mostly older legacy code, weren't really worth keeping. The truth is Olive started largely as a learning/passion project and now that we're taking it more seriously, I knew it deserved some better foundations.

Hence the second goal of the rewrite, better code quality. Making sure the code is well documented and logical from the ground up to help contributors as well as just streamline development. Additionally, since the code is becoming much more modular, we'll have the option to spin off parts of code as libraries if we ever decide to.

While we've definitely switched gears from the rapid development early this year, I'm actually very excited about what's to come. I think this is a step in the direction of a super stable and extremely powerful NLE for everyone. Some of the stuff we have planned isn't even in commercial NLEs. We may still be a few months away from a usable version again, but I believe Olive will be coming back better than ever.

At this point, the master branch is quite unusable, so if you are thinking of upgrading to the latest and greatest, I would advise sticking to either the latest release from May 2, or a carefully picked checkout from the master branch from the second half of May, before all the architectural changes (use git checkout HASH for that).

Music-making

The most interesting release over the past month here was the much anticipated VCV Rack 1.0. Release highlights are polyphony (up to 16 voices), MIDI output (as well as a CV-GATE module for drum machines and a CV-CC module for Eurorack), easy MIDI mapping, a new visual module browser, a multi-core engine etc. For more details, see https://vcvrack.com/Rack.html.
The dev version of Ardour recently got support for Contour Design's ShuttlePRO v2 and ShuttleXpress control surfaces (contributed by Johannes Mueller), as well as for Behringer's X-Touch and X-Touch Compact (contributed by Todd Naugle).

On top of that, Robin Gareus improved the semi-forgotten headless version of Ardour, and Damien Zammit updated the ProTools importer. Robin also enhanced the Stem Export dialog to allow excluding muted and hidden tracks.

More interestingly, he contributed a basic PulseAudio back-end, just for stereo playback for now (you can also launch pavucontrol from within Ardour).

Rui Nuno Capela released Qtractor 0.9.9 with tempo/beat-detection support in the Clip > Tempo Adjust… dialog. Other changes involve bugfixes, a bumped dependency on Qt 5.13 and newer, as well as asking for a new filename whenever the session file's original sample rate differs from that of the current audio device engine.

Rob van den Berg et al. made the first release of Ninjas 2, a rewrite of the Ninjas sample slicer. You get it as both a standalone application and LV2/VST plugins.
It's one of those cool projects born out of necessity and doing just what they need to do.
<img src="https://librearts.org/2019/08/week-recap-4-aug-2019/sw-ninjas2.png" loading="lazy" alt="Ninjas2" width="100%"/>

Tutorials

New Inkscape tutorial from Nick Saporito.

And a new GIMP tutorial by Davies Media Design.

Steve Lund published a new Blender tutorial on adding CGI characters to live footage using camera tracking, masking, and compositing layers.

One of many new Godot tutorials by the GDQuest team.

Showcases

This isn't a painting, it's actually a Cycles render by SergOrc, from his Aladdin fan art series:
<img src="https://librearts.org/2019/08/week-recap-4-aug-2019/art-blender-sergorc-carefree-childhood.jpg" loading="lazy" alt="Aladdin fan art, Cycles render by SergOrc" width="100%"/>

If you love low-poly art, check out this Cycles render by Burak Gök:
<img src="https://librearts.org/2019/08/week-recap-4-aug-2019/art-blender-burak-gok-candy-shop.jpg" loading="lazy" alt="Low-poly Cycles render by Burak Gök" width="100%"/>

A small teaser from David Revoy, made with Krita as usual:
<img src="https://librearts.org/2019/08/week-recap-4-aug-2019/art-krita-david-revoy-starry-night.jpg" loading="lazy" alt="Starry Night by David Revoy" width="100%"/>

And a new landscape drawing made with Inkscape, by Ozant Liuky:
<img src="https://librearts.org/2019/08/week-recap-4-aug-2019/art-inkscape-ozant-liuky-landscape.jpg" loading="lazy" alt="Landscape drawing with Inkscape, Ozant Liuky" width="100%"/>
While I do owe you quite a few weeklies by now, I’m using this Friday afternoon as an opportunity to talk about something important.

A couple of months apart, there were two cases where FOSS or FOSS-related organizations used non-free software to produce content.

The first case was people using non-free graphic design tools to make some of the graphics for the Libre Graphics Meeting 2019 website.

The second case was GNOME Foundation getting a contributor to do the design of the Foundation’s annual report, which ended up being done with Adobe InDesign for Windows (the contributor was supportive of GNOME’s mission).

@gnome Foundation is not Using GNU/Linux or GNOME for creating a report. They are using Windows. Wow Fantastic development in Free software. Here We are using @scribus for these things. pic.twitter.com/PUBeudBLRG - ranjith siji (@ranjithsiji) June 30, 2019

In both cases, people were pretty much using non-free software to help the free software cause. This hits home a bit, because I’ve been using Sublime Text as one of my main production tools for the past 8 years or so and wondering at times if I’m a hypocritical asshole.

The LGM/GNOME situations are, in fact, quite similar to the one we had a few years back when Andrew Price said he looks at a ton of artwork produced with DCC (digital content creation) tools other than Blender, because he feels he can learn a lot from a wider community of artists.

So, here’s my point: the important part is how we treat the opportunity to use proprietary tools or talk to people who do so (in fact, starting with whether we think it’s an opportunity at all).

We can get people to use non-free tools to make a contribution to free tools to get them started, then see if they can do the same with free tools, and proceed from that.

We can get people to use free software and then tell us where they think it fails and why, then figure out the next step. A bug can be fixed.
A missing feature might just click into place, but sometimes it’s just architecturally alien and can’t be done without doing a ton of rewrites (and then the rewrites may or may not be justified).

We can try to adapt proprietary techniques to free software and learn something new in the process or, at least, make useful bug reports.

We can look at art/music produced with Photoshop, Houdini, ZBrush, Logic, ProTools etc. (we do it anyway every time we go to watch a movie), get triggered by something we see or hear and then have new ideas for entirely different things we can try to accomplish using free software.

In other words, all the information we get through our five senses is part of a learning process. We can use it to do better things. You don’t have to agree with that though!

Past me, ca. 27 years ago: "What's this Emacs thing?" <pokes around> "ZOMG yes. Invest now -- will pay off for rest of life!"
Past me, ca. 15 years ago: "What's this Facebook thing?" <pokes around> "Whoa, no thanks. I'm outta here."
Present me: "Thanks, past me! Good calls!"
- Karl Fogel (@kfogel) July 3, 2019

So while we can cherish the idea of free/libre software and be as strict as RMS (is it even possible?), at the end of the day the point is to grow as artists, whether professionally or not, and have fun when possible. My takeaway from the LGM/GNOME story is: be flexible, allow some blasphemous and ungodly things to happen for better GNU things to come, and never stop learning.

Meanwhile, one of the positive outcomes from the latest controversy is that the pdftag project by Adrià Arrufat got some attention. It’s a nice little app written in Vala and GTK that does just one thing: the editing of basic metadata fields in PDF files. Use it wisely! :)
Interview with Pat David at Libre Graphics Meeting 2019
At Libre Graphics Meeting 2019 in Saarbrücken, I had the fun and privilege to do a quick interview with Pat David, founder of Pixls.us and contributor to GIMP. We talked about how his project evolved and where he thinks we are with free/libre photography tools presently.

This was essentially the second video interview I ever did. The first one (with the Krita team), done a few hours prior to that, was a complete flop (technically), and this one is only marginally better. So, sorry about me being this awkward dude. I will do better :) If you prefer reading text, a rough transcription follows.

I’m joined today by the magnificent Pat David from Pixls.us, also from GIMP, also known as Pat Davis.

Yes, all hail Pat Davis.

So, Pat, Pixls at the beginning and Pixls right now are two significantly different projects, because when Pixls started, it was kind of expected to be this huge host of excellent GIMP tutorials, and today it looks slightly different than that because it’s like this huge big discussion board. So why do you think that happened?

That’s funny, I think you were even on the original slide where I had talked about making better GIMP tutorials as being a reason for building Pixls.us.

I think it changed because the community that began to grow up around it needed a place and a way to communicate with each other, share images, talk about how they do things, and so a natural extension of the website originally was to then produce a forum for everyone to get together.

And, you know, as most things happen, if we have a lot of people speaking with each other and contributing on a forum, it will tend to grow much faster than the site itself, because they have way more time to make comments and posts than I have to write tutorials and articles to keep up with them. So that kind of grew on that side very much.

So Pixls has a kind of an acquisition thing going on.
I mean darktable, Filmulator, G’MIC, and other projects basically use your discussion board as the main discussion board for themselves. So what’s your dark deep secret of getting them?

A dark deep secret is make friends and be kind.

Be kind.

That’s all you need to do.

Kindness.

Yes, very much. I think, as you know, we built Pixls so that these projects wouldn’t have to worry about that infrastructure themselves.

Nobody that makes a free software image editor starts it so that they can moderate a forum about it. That wasn’t really their first intention.

But we as part of the community can write, we have people that can moderate things and can engage with the community, and that doesn’t have to take time away from the developers of the programs themselves and the projects themselves.

Those folks are best spending their time developing new features or fixing bugs, not moderating comments or dealing with discussions that happen.

So we offered it to them. I know David Tschumperlé, or Jo at darktable, and Tobi, and we said: “If you want this, it’s here, and we’re happy to manage all of that stuff for you, so you don’t have to”. Then they came aboard.

So I think between that and being kind and friendly and nice whenever possible they came on board. Yeah, it’s been great.

Awesome! So, one thing I’ve noticed is that discussions in, well, generally most Linux user groups and free software user groups tend to lean quite a lot toward the technical side. Which is a fine thing because we need this [software] to work as expected. But on the other hand, do you think we have enough artistic discussions? In fact, do we know how to encourage them to appear? Do we know how to moderate these discussions?

Yeah, that’s a great question. I don’t feel like we have enough artistic discussions happening right now.
But it tends to be very technical because the users tend to be very technically inclined.

But we have slowly, at least on the Pixls forum, started to have some interesting discussions, for instance, around certain photographers and their work.

We had one about Eggleston and his ability to find beautiful images in the most mundane places. And it sparked a nice discussion about what about his images was interesting visually and what things he might be doing.

And then we have them for a lot of landscapes and macros and portraits and things like this, and we try to engage that and it’s not too hard. That one’s actually the easiest: find something you love.

I love Dan Winters as a portrait photographer, so there’s a great discussion about what it is about Dan Winters’ portraits that I like. And you can very easily begin that discussion on the site and let people kind of give you feedback and tell you what they’re thinking. So that kind of thing just needs somebody to occasionally post about something they love and to ask others what they might think about it as well.

You spend a lot of time on Pixls and you mostly talk to people who are already using free software. They are more like veterans who have been around for a long, long time. So you might not get the full picture of what professional users might want from software. How do you deal with that?

That’s right. What we try to do is reach out to a cross-section of those people as much as we can.

The professional users don’t spend a lot of time on the forums because they’re busy being professionals, earning a living, booking new jobs, and doing that kind of work.

But you do occasionally get them to come, and it’s nice because I find that if you spend a little bit of time to engage with them, and if they have the time, they are gracious with it, so they’ll give you an answer. They’ll tell you what they think and what their needs are.
And if you’re an advanced enthusiast or amateur that can engage with them as well, especially on the forums, you get a chance to really be able to say, you know, why is that, what things are happening here, and how can that affect what I might be doing as well.

So I get to learn basically from them, from what they need.

Having spent a lot of time in conversations at both Pixls and sites like dpreview.com, what is your understanding of where free software for photography is shining and where it’s still lacking? What’s the big picture in your opinion?

I think that we are in a nice Renaissance of tools that is happening right now.

The raw processors, for example, are getting much better, much faster, and much more available, and, most importantly, much more approachable for people to use.

Long gone are the days of UFRaw, where you had to really kind of know what the heck UFRaw’s controls were telling you and how they worked. It was very unintuitive, but now we’ve got RawTherapee, we’ve got darktable, we have Photoflow, we have Filmulator.

We have these great projects that are providing new ways of approaching the software or approaching making images, and new tools to let you really kind of feel what your creativity can do, not really bounded by the technicality anymore.

So that’s a fantastic aspect of it. That, and brand new features and capabilities in GIMP that are far beyond what we would have had even six years ago. This is a very good thing.

The place where we lack, I think, is outreach, very much. And we’re especially lacking in good quality content to demonstrate to people that these are extremely capable projects.
That’s with the exception of maybe some slightly convoluted workflows, which is a nice way of saying “I don’t want to go through 40 steps to do the same thing I can do in commercial software”.

But we can get there, and it may take a little more work, but I feel the benefits far outweigh the cons.

That is, the freedom in the software and the freedom in never ever losing my files. I can go back to my GIMP files from 12 years ago and I can open them without a problem and get right back to work on them if I want. And I don’t know if I could do that in another project.

So we are lacking quality material, we’re lacking public outreach, and we’re lacking people making big noise about making cool projects and images with free projects.

But this will change?

Hopefully. That’s what we’re trying to do.

Thank you for joining me!

I’m so glad. Thank you very much!
Layers of light and dark are a tremendously important technique in composition. They help you both lead the eye and make sure individual objects can be seen as separate from each other. This lesson discusses these techniques in broad terms and gives a number of suggestions on how to use layers of light and dark to make your imagery compositionally stronger. https://www.youtube.com/watch?v=Q6q0-bERSFU
In composition theory, a more pleasing composition is a composition that contains contrast. Now by contrast I don't mean only the contrast control in Photoshop, although that can be used to give us the kind of contrast we're discussing. I'm talking about the original definition of the word: "Contrast: The state of being strikingly different from something else, typically something in juxtaposition or close association."
So contrast is two opposing things. Like black and white is a contrast. Big and small is a contrast. Soft and hard is a contrast. Loud and quiet is a contrast. Thick and thin is a contrast. Textured and smooth is a contrast.
A very important topic in composition theory, and available for the first time as a YouTube video, this tutorial will discuss the concept in detail, and it will lay the groundwork for a number of upcoming composition tutorials I have planned. https://www.youtube.com/watch?v=ocuFrOD1qEc
Just posted a video lesson showing how to use my old script Wiremaker (aka Wirebundler / Wirejumble). Great for making bundles of wires, tentacles or tree branches. Hope you find it helpful! https://www.youtube.com/watch?v=pciLlHxf0Ww
Welcome to the 6th official Megastructure Book Project Update, a visual encyclopedia of scifi megastructures!
So first off, the rough layout of the book is complete. More work still needs to be done: the text of the book is getting edited right now, so some adjustment will need to be made once we have the final text. An image or two will need some finessing to fit the format. Some extra background graphics need to be added. But overall, all the bones of the layout are complete. On a technical note, most books are laid out in Adobe InDesign, but since Adobe software is now rental only, and I don't use rental only software, I got a copy of Affinity Publisher for a one time purchase price of $50, and found it just as powerful as InDesign ever was. Thanks Affinity for charging a fair one time price!
Second big news is that book proposals for the project have now been sent off to all the publishers I wanted to contact. Will spend the next several months in talks with various publishers to see if any of them are the right home for the project. If we find a good match, awesome. If not, the next step will be the Kickstarter to preorder, followed swiftly by printing and shipping. Will let you know how it goes.
That's it for now, hope to provide more updates soon. I am super stoked to get the project finished and into your waiting hands. Megastructures started in 2014, and so in the context of the whole project, we're in the home stretch now!
And thanks again for continuing to follow along!
Welcome to the 5th official Megastructure Book Project Update, a visual encyclopedia of scifi megastructures!
I just received the final 2 images for the Megastructure project from one of our guest artists. So that's it. All images for the book have been finished! That's a total of 46 paintings, 43 diagrams, and 7 bonus images! A big thanks to the guest artists for adding their own unique vision to the other artwork in the book. It's possible an image or two may be added or lost in the final layout process, but I'm declaring this 100% image complete. We have all we need to make this book happen!
So the next step is completing the rough layout, which is likely to be finished in the next few weeks, and the completion of the book proposal, which is being sent to a number of major publishers.
Will post another update when the book is laid out, and then keep you updated on how the publisher search is going. If no suitable publisher is found, we'll kickstart the printing just like we did with the last book. It worked last time, so it'll work again. But I'm hopeful we'll find a good publisher who believes in the project and will allow us to reach a wider audience. Will let you know how it goes. And thanks again for continuing to follow along!
Yearly Wrap Up, Goodbye 2020, may we never see the likes of you again!
What a mess of a year! I'm sure I don't have to tell you what a crazy year this has been. We've all experienced it ourselves and did our best to weather its uncertainty and its troubles, and a few of you likely even had to battle the virus in person. But it's time for the yearly round up of soulburn art, and 2020 isn't gonna stop me from doing it, so here goes.
In a week I will have been working at Monolith for 2 years as a staff concept artist. We were the lucky ones: the videogame industry is still going strong since so many customers had to stay indoors. Many friends in the live action portion of the film industry had a much more difficult year (not to mention all of the front line workers, essential workers, etc). Looking forward to everyone moving in the right direction again in 2021. As for our new game, I'm currently on concept piece #633. It will still be a while before we can show what we've been working on, but suffice it to say I have a lot of new artwork to choose from when it comes time to share. Really proud of the entire Monolith team, this game is gonna be crazy cool!
The main focus this year for me in my side hustle was the Megastructures book. We're about a week away from all the artwork for the book being completed (will do a separate post on that when it is). While a lot of work, I'm really, really happy with how the book is going so far, and I think you will be too. And thanks to the awesome guest artists who helped out!
Then 2020 saw the release of V1 Interactive's "Disintegration" videogame, and the release of a bunch of concept art I made for the game. Did my 4th Inktober. Posted a whole bunch of new video tutorials to my YouTube channel. No comicons this year for obvious reasons, hope to see that change next year.
So for 2021, I'll be continuing to work on the Monolith game (and may even return to the office mid year after 1.5 years working from home), and the main work on the Megastructures book will be done, so we can focus on publishing, either with a major publisher or by kickstarting a self published book.
To everyone who has been following my work this year, thank you so much for being a part of this journey. May the coming of the vaccine allow us all to focus again on connecting with each other in person. We are social creatures, and this year has made that very hard. So may next year be a renaissance of parties, concerts, and human connection. And may everyone who has been affected by 2020 (whether it's sickness, losing a loved one, or losing their job) get back on track in 2021!
Welcome to the 4th official Megastructure Book Project Update, a visual encyclopedia of scifi megastructures!
I just passed a major personal milestone on the project, and completed my very last painting for the book! My 32nd painting is done, the Stanford Torus, which I had left for last since it was going to be one of the most difficult.
So all the art is done then? Not quite, but we're close. As I mentioned earlier, I have a team of guest artists who are contributing a number of pieces of artwork to the book. All of them are working hard to get those last images done. So we currently stand at 91% art complete, and we're still on track to get all the artwork done by Christmas. Over the next 2 months I will polish the text in the book while I wait for those final images, and then the second rough layout starts in January.
So more progress is being made! Thanks again for following along. The next update should be to announce all the artwork is done.
Just posted this text / video lesson called "Compositional Weight", discussing how to balance your paintings visually using contrast to draw the eye. Hope you find it helpful! http://www.neilblevins.com/art_lessons/compositional_weight/compositional_weight.htm
A tangent is simply an area where two things in an image are nearly touching or actually touching, and in doing so create a visual mistake that the human eye doesn't like. It can draw the eye to a part of the painting that's unimportant, it can just feel wrong or strange, or it can confuse the viewer as to the relative depth of the two objects. This lesson explains in more detail what a tangent is, and the best ways to fix them in your own artwork. http://www.neilblevins.com/art_lessons/tangents/tangents.htm
A Concept Brief is a document that a client / art director makes for the concept artist to explain what they'd like them to design. It includes information like what the object / character is, what format the art should take, reference images, etc. Watch or read this short lesson on how to write a good concept brief, or if you're an artist, what things you should expect from one. http://www.neilblevins.com/art_lessons/writing_a_concept_brief/writing_a_concept_brief.htm
Added a Photoshop Brush pack of Tree Silhouettes to ArtStation and Gumroad, great for far away trees and super closeup trees that don't need any interior details. http://www.neilblevins.com/art_assets/brushes/Soulburn_BrushAssetPack_TreeSilhouettes_1.htm
And here's a painting I did many years ago with these brushes, along with a tutorial discussing how to make your own tree brushes: http://www.neilblevins.com/art_lessons/tree_brush/tree_brush.htm
Mark One Dev Log #3 Overview of the Animation System
Welcome to the third dev log for my game! In this post I’ll provide a brief overview of the animation system running for each character. Before I dive into more details, in case you want to support my project, you can wishlist it on Steam using the following link: Need of... The post Mark One Dev Log #3 Overview of the Animation System first appeared on Orfeas Eleftheriou.
Welcome to the Mark One Dev Log #2! In this post I’m going to talk about the logic that drives ranged AI enemies in the game. Before I dive into more details, in case you want to support my project, you can wishlist it on Steam using the following link: Rules of... The post Mark One Dev Log #2 The Mind of Ranged Enemies first appeared on Orfeas Eleftheriou.
Mark One is a game that I have been working on for the last few months. It’s a top down Sci-Fi shooter game, heavily influenced by Doom, with twin stick shooter elements. In this post I’m going to talk about the tutorial level of the game and how important it is for the players. But before... The post Mark One Dev Log #1 first appeared on Orfeas Eleftheriou.
In this post I’m going to show you how you can profile your code blocks in order to identify potential issues and make your game run smoother! In order to expose your code’s performance in the Unreal Engine profiler tool you really need two things: A stat group, which is the “category” of the code you’re profiling in... The post Profiling Code Blocks first appeared on Orfeas Eleftheriou.
In this post I’m going to show you how you can consume .lib and .dll files in your UE4 project. In case you don’t have a library or a dll file hanging around, I’m going to show you how to create one and provide some example files that I have created for this post. This post... The post Consuming lib and dll files first appeared on Orfeas Eleftheriou.
In this post I’m going to show you how you can debug packaged builds with Visual Studio. The process requires the following steps: adding a breakpoint somewhere in your code; including debug files for packaging; packaging your project with either the Debug or Development build configuration; attaching the Visual Studio debugger to your game’s process. Since each project is different I’m... The post Debugging Packaged Builds first appeared on Orfeas Eleftheriou.
In this post I’m going to show you how to create procedural textures inside the editor. While in the following example we will generate a 1024 x 1024 texture using random RGB values, the same approach can be used with different generation algorithms to achieve your goals. This post was written using 4.22 version of... The post Generating Procedural Textures first appeared on Orfeas Eleftheriou.
In this post I’m going to show you how to create your own custom AI senses. For demonstration purposes I’m going to create a new AI sense named “Aquaphobia” which will be responsible for identifying nearby water resources and registering a stimulus for each water resource. Then, the controller of the AI pawn will move the... The post Creating custom AI senses first appeared on Orfeas Eleftheriou.
All Velocity Modules in UE4 Niagara Explained | CGHOW (18 Mar 21)
Channel Ashif: http://bit.ly/3aYaniw
Support me on https://www.patreon.com/Ashif
Support me on paypal.me/9953280644
#cghow #RealtimeVFX #UE4Niagara #gamefx #ue4vfx #ue4fx #niagara #unrealengineniagara...
Those look stunning! I'm using Unity to develop right now and wanted to give Houdini a shot, mixing both tools together to create stunning VFX like in the video, but I dropped Houdini pretty quickly. I'm very interested in learning more advanced stuff. How useful is this for me then?
Hi VFX people! Today, I wanted to share with you another effect I made this week for my project. With this one, I had this idea of a fire black hole that popped into my mind while making some simulation tests with FluidNinja. It was the first time I got involved with a fully homemade fire effect and I had some trouble working with colors and behavior. Here's a capture of the final effect within Unreal Engine: I used the same workflow for this fire effect as the one I used before for the ice one. I plan to make a post soon to explain how those effects really work! Feel free to give any feedback, I'm really willing to improve! And may the VFX be with you guys!
Made a tutorial about how to spawn particles on an animated character (a skinned mesh) in Unity. This technique can be quite useful to create some cool effects for a character or an animated object. Enjoy!
Heyo good people of the realtimeFX world! I just wanted to drop this here for your eyeballs! Sjors De Laat, Ge Lush and I worked quite hard to make this workshop happen (well, I didn't work as hard). So I'm sharing it here as well. That is all. Any questions or feedback are welcome! Have a good day.
REBELWAY 2.0 Stylized FX for Games | Pro VFX Course - Rebelway. Learn how to create realtime game FX with this exclusive Unreal Engine course from Rebelway. It's time to create magic, portals, and more.
Diving into Color. Tips & Tricks I use for using color
Great point, color perception does change over time! I think even the color switch between pink for girls and blue for boys happened not a long time ago too, in the 20th century, so it really makes me think about what will happen in the future.
Oh yes, the health bar really gets me conflicted, and not only conflicted, I even get a bit more philosophical about it. Why is it even red? Is it associated with the heart icons that represented how many lives a player has? Is it just because red is great at grabbing attention? Is it just somebody's favorite color? Hmm, one red healing spell that I can think of now is health leeching from other enemies; those are usually blood red with some black accents. Yeah, it is harder to find the right spot. I tend to stick to the classics and observe what others do, as I think I am not there yet to break the rules. If all the VFX artists united we could start a revolution and change the so-called rules, hehe!!
The VFX readability over the backgrounds is one of the most common things that I struggle with, so you are not alone here. I mean ideally, you could have two types of effects: one for the bright background and one for the dark one (or just have some of the elements of the effect changed, e.g. have different smoke) and switch them with scripting. Sometimes this is just way quicker and helps to keep you sane! If there is a constant day & night cycle... well, that's on you, buddy. A lot depends on the lighting in the scene. Sometimes you can solve problems with lit particles, something I tend to use on steam/smoke in daylight scenes. But yet, this is still a mystery to be solved (a very painful one too).
Diving into Color. Tips & Tricks I use for using color
@ShannonBerke @simonschreibt Ah you guys, you sure know how to make a girl hyperventilate!! Thanks, everyone, for such positive comments, really made my day!
Face area to vertex color add-on, can we improve wireframe renders?
CGMatter showed in a video how to create a wireframe shader that, unlike the Cycles wireframe node, does not look at the triangulated mesh but at the original faces.
The problem
His node setup works with a UV layer that contains a default UV mapping (one that you can get with UV unwrap --> Reset) and then basically calculates the distance to the edge of the UV square. This works, but it is not quite the same as the Cycles wireframe node. As some commenters pointed out, the line thickness is a fraction of the UV square, so it will yield thicker lines for larger faces.
The solution?
We could correct this if we had the face area available, but this is not an attribute we can access directly using nodes. So I came up with a small add-on that stores the face area in a vertex color layer, scaled to the largest face, so the largest face will get an RGB value of (1.0, 1.0, 1.0). If we have such a vertex color available, we can simply divide the width of the original wireframe shader by the square root of this value (because the fraction of the UV coordinate actually scales with the length of a side). The result now looks more even for large and small quads.
Limitations
For meshes with fairly regular quads this is acceptable. However, for less regular quads this calculation might give unwanted variation, especially for thicker lines (note the quads on the edges of Suzanne's ears). For non-quads the solution doesn't work at all, because some side will not align with the principal directions of the UV map. So I don't think this is a solution at all, but making the face area available as vertex colors might have its use in other scenarios, so it is available for download.
Download
The add-on can be downloaded from my GitHub repository. After installation it will be available in the Paint menu when in vertex paint mode. Simply select it and the active vertex color layer will be filled with grey values that represent the face area. A .blend file with a node setup is available as well. The wireframe node group contains an extra input that, if set to a value slightly larger than zero, will apply an additional perspective correction so that lines won't shrink with distance from the camera.
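As a rough illustration of the correction described in this post, here is a minimal sketch in plain Python (no Blender API involved). The function names, the example face areas, and the base width of 0.02 are all assumptions made up for the example; the add-on itself stores the normalized area in a vertex color layer and the division happens in the shader nodes.

```python
import math

def normalized_face_areas(areas):
    """Scale face areas so the largest face maps to 1.0 (the stored grey value)."""
    largest = max(areas)
    if largest <= 0.0:
        raise ValueError("at least one face must have a positive area")
    return [area / largest for area in areas]

def corrected_line_width(base_width, grey_value, epsilon=1e-6):
    """Divide the UV-space line width by sqrt(grey) to even out thickness.

    The UV fraction scales with the side length of a face, and the side
    length scales with the square root of the area, hence the sqrt.
    epsilon is a hypothetical guard against division by zero.
    """
    return base_width / math.sqrt(max(grey_value, epsilon))

# Hypothetical square faces with side lengths 2, 1 and 0.5
areas = [4.0, 1.0, 0.25]
greys = normalized_face_areas(areas)
widths = [corrected_line_width(0.02, grey) for grey in greys]

# Width in object space = UV fraction * side length; ideally equal per face
world_widths = [w * math.sqrt(a) for w, a in zip(widths, areas)]
```

For perfectly square faces the object-space widths come out identical, which matches the post's observation that regular quads look even; the sketch says nothing about irregular quads or non-quads, where the post notes the approach breaks down.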
I’m super excited to invite you to the Unreal Bucket beta. The project my friends and I have been working on for months is finally ready to test! Watch the video and sign up on our website: unrealbucket.com. We got some nice Unreal Engine assets for you too!
There’s no doubt 2020 was a very challenging year for many people. It forced us to adapt to new realities and changed the way we live. Starting a big project with no external funding in these uncertain times wasn’t the best idea at first. However, I couldn’t resist the “now or never” feeling. I teamed up […]
Lately, I've been doing some freelance work on the side for some of my UE4 marketplace customers who required custom modifications for the blueprint templates. And one of these projects has a heavy focus on data-driven design through the use of Structures, Data Tables, Data Assets, etc. Without going into specifics, the advantage of doing so is essentially to enable the customer to make changes to the project mostly through data without tinkering with blueprint logic.
While working on this project, I recently had to make some changes to a Structure which was being used across a lot of blueprints. After doing some testing, everything seemed to be working fine, with no compiler errors or gameplay crashes. So I decided to wrap it up and package the project, but then the packaging process failed and threw a whole bunch of "Unknown structure" errors. And it had to do with the edited structure.
Granted, the errors show the blueprints that contain this structure, but disconnecting & reconnecting every single one of those "Break Struct" nodes would have been quite time consuming. However, thankfully I stumbled upon an option under the blueprints menu that made this process a whole lot smoother.
Fixing the errors still does require going to each of the listed blueprints, but instead of going to each struct related node, you can essentially go to the File option in the top menu, and select Refresh All Nodes.
Compile & save, and then do the same for all of the required blueprints. And that's it. If you try packaging the project again, there should be no "Unknown structure" errors putting a halt to the process anymore.
This simple solution helped me save a decent chunk of time that would have been otherwise spent doing the same repetitive task across several blueprints. So I figured I'll share this information here. Maybe it'll save you some time in the future.
Tower Defense Starter Kit Tutorial: How to add new Tower Functions
A Quick Introduction
What are Tower Functions? Well, you can think of them as special abilities for the Towers that can be activated to provide additional help in your defense against the enemy AI waves. The toolkit already comes equipped with a couple of Tower Functions, namely Overdrive & Repair. Both of these functions are handled through dedicated actor components that are added to the selected Tower when activated.
The Overdrive function temporarily boosts the output of your towers, while the Repair function restores a damaged tower back to its original HP. And in this tutorial, I'll show you how to add your own custom Tower Functions into the mix.
[Additional Notes: This tutorial is based on the latest v2.17 edition of Tower Defense Starter Kit, but should also work just fine on most prior v2.x editions of the toolkit. v2.18 introduced a new and more streamlined workflow for Tower Functions. More information on the new workflow will be shared on the blog soon.]
Tutorial: Event Begin Play
1. Add a new entry to the enum ETowerFunctions for our new Tower Function. I'm going to call it "PrintTest" since I'll be creating a simple ability that will keep printing a string at timed intervals for as long as the Tower Function is active.
2. Next, create a new blueprint Actor Component for our Tower Function. Open the BP, add a new float variable AbilityDuration to it, and set it to Instance Editable & Expose on Spawn through the variable details panel. Finally, it's time to add the implementation for our Tower Function. Since my component is just going to print some strings, I've added a timer to do just that as shown below. Just replace it with your own logic and we're done with the component setup.
3. Now open up the blueprint interface BPI_TowerFunctions, and add a new function to it: Add"TowerFunctionName"AbilityComponent. Next, add the following three input parameters to this function: AbilityDuration (float), StatusBarFillColor (Linear Color), & UIImage (Texture 2D).
4. We'll now have to implement this interface function in the BP_Tower_Parent blueprint. The implementation involves the following steps: destroy any existing tower function component, add our new tower function component & store a reference to it, and initialize the UI icon for the tower function. Here is an example workflow that you can follow:
5. Now we need to let the game know about our new Tower Function. To do that, open the DT_TowerFunctions data table, and add a new entry to it for the Tower Function. Set the various parameters as per your requirements, but the two most important ones here are FunctionType & FunctionComponentClass, which have to be set to the enum entry and the component we created earlier, respectively.
6. Finally, open up the BP_TowerManager blueprint, and first add a new element to the AvailableTowerFunctions array and set it to our new Tower Function enum. Then head over to the ActivateTowerFunctionForTower function, and call the interface function we created earlier, for the switch case flow associated with our new Tower Function as shown below:
Tutorial: Event End Play
And that's it. When you select a Tower from the game next time, you should see an additional interactive icon for activating the new Tower Function, and clicking on it should run the logic added in the component earlier.
Now that you've set up your own Tower Function, I just wanted to leave you with an optional step that will enable you to update the displayed duration left for the Tower Function after it has been activated. You can call the UpdateAbilityStatusDisplay interface function on your Tower (from the component) to do just that. The Overdrive & Repair functions already use this function for that purpose since both are temporary abilities. So if you're interested, I'd suggest checking out their blueprint implementations to see how it's done.
So, with that we've come to the end of this tutorial. If you have any doubts regarding the workflow, or have run into some issues with the implementation, just let me know in the comments.
Unreal Engine Tutorial: Setting custom Game Modes at runtime using Blueprints
For the longest time, I believed that it was not possible to change Game Modes at runtime through blueprints, and that the Game Mode had to be specified in the editor either through Project Settings or the World Settings for individual maps. Fortunately, I was wrong and there actually does seem to be a method for overriding the default Game Mode through blueprints.
I was recently involved in a project that required setting up completely different types of gameplay based on whether the player chose to play as the attacker or the defender. While it was definitely achievable by creating separate maps for each game mode, it just seemed redundant given that the maps are identical in all other respects. It made a lot more sense to have two separate game modes with their own player controllers and HUD classes. And after some research along that line of thought, I realized that the Open Level node had more tricks up its sleeve than just opening a new level.
The Open Level node, as the name suggests, is what you call when you want to move to a different level. But if you expand the node, you'll come across a new input parameter named Options. This can be used to pass in additional string commands when you open the new level. I'm not sure what exactly that entails in terms of possibilities, but we can definitely use it to override the default Game Mode associated with the level. All you have to do is specify the asset reference path for your Game Mode as per the format shown below:
?Game=GameModeReferencePath_C
You can get the asset path by right-clicking on your Game Mode asset in the Content Browser, and selecting the Copy Reference option. What we need is basically the part within the single quotation marks. For example, this is what I get upon using Copy Reference:
Blueprint'/Game/TowerDefenseStarterKit/Blueprints/GameModes/BP_GameMode_TowerDefense.BP_GameMode_TowerDefense'
So my reference path is:
/Game/TowerDefenseStarterKit/Blueprints/GameModes/BP_GameMode_TowerDefense.BP_GameMode_TowerDefense
With the path obtained, the final string command for this example scenario will be as shown below:
?Game=/Game/TowerDefenseStarterKit/Blueprints/GameModes/BP_GameMode_TowerDefense.BP_GameMode_TowerDefense_C
Now all that's left is to type this in the Options parameter, and the next time this Open Level node gets called, it will also make sure that our custom Game Mode overrides the default Game Mode. If that sounds confusing, here is an example of what it looks like in my project (Select node not required; the path can be directly entered in the Options parameter):
And that's all there is to it. This was pretty much completely new information to me, and it saved me the trouble of taking some roundabout route to meet the design requirements. And now hopefully, this information will be of help to others as well.
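The string manipulation above (strip the Blueprint'...' wrapper, prefix ?Game=, append _C) is easy to get wrong by hand, so here is a small sketch of it in Python. This is only a helper for preparing the text to paste into the Options parameter; the function name game_mode_option is made up for this example and is not part of Unreal Engine.

```python
def game_mode_option(copied_reference: str) -> str:
    """Build the Options string for Open Level from a Copy Reference result.

    Hypothetical helper: it only reshapes text and does not talk to the engine.
    """
    text = copied_reference.strip()
    start = text.find("'")
    end = text.rfind("'")
    if start != -1 and end > start:
        # Keep only the part within the single quotation marks
        path = text[start + 1:end]
    else:
        # Assume a bare reference path was passed in
        path = text
    if not path.startswith("/"):
        raise ValueError(f"not an asset reference path: {path!r}")
    # The generated Blueprint class carries a _C suffix; avoid adding it twice
    suffix = "" if path.endswith("_C") else "_C"
    return f"?Game={path}{suffix}"

copied = ("Blueprint'/Game/TowerDefenseStarterKit/Blueprints/GameModes/"
          "BP_GameMode_TowerDefense.BP_GameMode_TowerDefense'")
option = game_mode_option(copied)
print(option)
```

With the Copy Reference text from the post, this prints the same final string command given above.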
Unreal Engine Tips: How to use Keyboard Inputs when Input Mode is set to UI Only
I had recently migrated some of the HUD workflow from Prototype Menu System to Tower Defense Starter Kit. While going through the HUD blueprints, I noticed that the Input Mode was being set to Game and UI when activating the Pause Menu. Since I didn't want the player to have the option to interact with the game outside of the menu in this state, I went ahead and changed it to UI Only, only to find out that the Escape button could no longer be used to quit the Pause Menu (in spite of it being set to Execute when Paused). It turns out that setting the Input Mode to UI Only also prevents us from interacting with the UI using the keyboard.
At first, it seemed like I'd have to revert it back to Set Input Mode Game & UI. But further inspection led to a workaround to tackle the issue: handling the keyboard input directly through the widget itself instead of the player controller. The first step is to ensure that the widget can be focused. We can do this by setting its IsFocusable attribute to True.
Next we need to make sure that the widget has keyboard focus, which can be easily achieved by calling the Set Keyboard Focus function in the Event Construct of the widget. This will enable the widget to receive keyboard inputs.
With that taken care of, all that's left is to override the widget's default keyboard input response. So we'll override its On Key Down function and add the required logic. For example, in my project, that involves unpausing the game as shown below:
And that's all there is to it. You should now be able to interact with your widgets using the keyboard even when the Input Mode is set to UI Only. I'll also be adding these changes to the Prototype Menu System in the next update. So it should be available on GitHub soon.
Controlling Player Progression in FPS Tower Defense Toolkit
Hi, it's been quite a while since I've written anything at all about my products on the Unreal Engine Marketplace. Lately, I've been working on a lot of improvements to the older products, starting with the Tower Defense Starter Kit earlier this year and now the FPS Tower Defense Toolkit as well.Over the years, FPS Tower Defense Toolkit has received several additional features through updates to make sure that it provides maximum value right out of the box. But with the inclusion of all of those new features also came added complexity. And as a result, I've taken the liberty of having most of the recent updates focus on making the toolkit easier to use and customize based on the community feedback received over the years. Even with the project being almost fully commented across all blueprints, with a large foundational framework like FPS Tower Defense Toolkit, things can still get quite intimidating for newcomers. And it is my hope that these articles will help in giving a clearer understanding of the core design of the toolkit.So in this article, we will focus on the process of controlling player progression in your games, specifically from the standpoint of what towers are available to them in each level. You normally wouldn't want to overwhelm the players with every single tower in your game right from the get-go. More often than not, it would be better to slowly introduce new towers as they learn how to play the game. Plus, doing so also has the added benefit of keeping a steady stream of novelty from a gameplay standpoint since the player is being given new toys to play with overtime.Before delving into the actual workflow (which is just a one-step process), I'll just give you a brief primer on the towers of FPS Tower Defense Toolkit. The toolkit comes equipped with an assortment of towers like Machine Gun Tower, Laser Tower, Sniper Tower, and so on. 
But in addition to these standard varieties, you also have access to what's called a Tower Base, which serves the dual purpose of providing a platform on which Towers can be built and acting as the means to block and control enemy paths. Finally, you also have a Trap class, which can be placed directly on the ground (without Tower Bases), causing all enemies walking over it to take damage. Out of all these towers, the Tower Base is the only entity that must be included in the player's arsenal, since it facilitates the construction of towers.

With that out of the way, I'm going to show you how to specify which towers are available to the player in each level. For this purpose, we need to have an instance (actor) of BP_TowerManager placed in the level. If you aren't familiar with the Tower Manager, you can check out this post to get a basic understanding of what it represents. Now if we select the Tower Manager from the level editor, we'll be presented with a list of parameters under the Config section in its details panel, but what we're interested in is the BaseTowerModels array.

The BaseTowerModels parameter is an array of type ETowerModels (an enumeration that lists all towers in your game) that determines which towers are available to the player in that particular map of your game. By default, you'll see that it contains a Tower Base, all unupgraded Tower models, and a Trap. With the exception of the Tower Base, which is mandatory (as a Tower construction platform), all other entries can be modified to suit your game's progression.
For example, if you want the player to have access to only the Machine Gun Tower and the Laser Tower during the opening levels of your game, you can do so by editing the array to have entries only for the Tower Base and the required Tower Models, as shown below:

The next time you start those levels, you'll find the changes reflected in the Loadout Selection menu (if you have it turned on) or the In-Game HUD (if the loadout menu is turned off), as can be seen in the following screenshots:

And that's all there is to it. You should now be able to control the Towers available to the player for tackling each mission on an individual basis. I'll be covering more topics like these for my Marketplace products in the coming months. So feel free to reach out to me through the support email if you have any specific requests along that line. Hoping to get the next post up and running soon.
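
As a rough illustration of the idea, the per-level tower list can be thought of as an ordered list of enum values in which the Tower Base is always present. The sketch below is plain Python rather than toolkit code: the member names of `ETowerModels` are guesses based on the towers mentioned in this post, and `build_level_loadout` is a hypothetical helper, not something the toolkit ships with.

```python
from enum import Enum
from typing import Iterable, List


class ETowerModels(Enum):
    # Assumed members; the toolkit's actual enumeration may differ.
    TOWER_BASE = "Tower Base"
    MACHINE_GUN_TOWER = "Machine Gun Tower"
    LASER_TOWER = "Laser Tower"
    SNIPER_TOWER = "Sniper Tower"
    TRAP = "Trap"


def build_level_loadout(base_tower_models: Iterable[ETowerModels]) -> List[ETowerModels]:
    """Return the towers available in a level, always including the Tower Base."""
    # The Tower Base is mandatory because other towers are built on top of it.
    loadout = [ETowerModels.TOWER_BASE]
    for model in base_tower_models:
        if model not in loadout:  # skip duplicates, keep the designer's order
            loadout.append(model)
    return loadout


# Opening levels: only the Machine Gun Tower and the Laser Tower are offered.
opening_levels = build_level_loadout(
    [ETowerModels.MACHINE_GUN_TOWER, ETowerModels.LASER_TOWER]
)
```

Forcing the Tower Base into the list is a design choice of this sketch; in the toolkit itself you simply keep that entry in the BaseTowerModels array.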
Building the UT Translocator using UE4 Blueprints - Part II
Alright, so we're back again to continue from where we left off with our tutorial on building a Translocator in Unreal Engine. In the first part of this tutorial series, I went over the design process behind creating the Destination Module for the Translocator. Now we're going to create the other half of the device: the Source Module.

The Source Module has two fire modes: the Primary Fire mode, which can be used to launch as well as retrieve the Destination Module, and the Alt-Fire mode, which teleports the player to the Destination Module (and retrieves it in the process). That's what we're going to implement in this second and final part of the Translocator tutorial. So without further ado, let's get started.

First, let us create an actor blueprint for the Source Module and set up its default properties. Once you've created the blueprint, open it, click on the Add Component button and add a Static/Skeletal Mesh Component (choose one based on your weapon mesh type). Since the mesh is not going to be interacting with anything, we can set its CollisionEnabled parameter to No Collision. Next, add an Arrow Component and adjust its relative location until it's right where you want the Destination Module to be spawned. Finally, let's add a new boolean variable and call it bTranslocatorFired.

Now that we have all the necessary components & variables set up, let's move on to the Event Graph. If you've played the original Unreal Tournament, you would have noticed that the Destination Module of the Translocator also gets spawned when you equip the device. So that's what we're going to do first. In the Event Begin Play, we'll spawn the Destination Module and attach it to the Arrow Component added in the previous step.

Before moving on to the primary and alt-fire logic, we will first add a separate event to retrieve the Destination Module & reset the Translocator to its default state. Let's call this event RetrieveDestinationModule.
This event will be called as part of both the Primary as well as Alt-Fire logic. To start things off, we want to reset the Destination Module to its default state. But since we've already implemented the required logic for this in the first part of the tutorial, all we have to do now is call the Reset event in the Destination Module. Next, we set its actor rotation to match the player camera rotation in order to make sure that it aligns perfectly with the Source Module before snapping back in place. Finally, we wrap up the event by setting the variable bTranslocatorFired to false.

Now let's start with the Primary Fire implementation. We'll create a new event PrimaryFireTriggerPressed for this. Since it works differently based on whether the Destination Module has been fired or not, we're going to first check that by passing our bTranslocatorFired variable through a Branch node. If it hasn't been fired already, we detach the Destination Module, call its Launch event and set the value of bTranslocatorFired to True. If, on the other hand, it has already been fired, we call the RetrieveDestinationModule event created in the last step, thus priming the Translocator to be fired again.

And so we arrive at the final piece of logic within the Translocator's Source Module: the Alt-Fire mode. Similar to the Primary Fire, we'll start with creating a new event AltFireTriggerPressed, followed by checking if the device has been fired already. If yes, then we'll Teleport the player pawn over to the location of the Destination Module, and follow it up with a call to the RetrieveDestinationModule event.

At this point, you should have a working Translocator at your disposal. But I'll also briefly show you how your player character can interact with it.
In the Event Begin Play of your player character, use the SpawnActorFromClass node to spawn an instance of the Translocator, store a local reference to it, and attach it to your character mesh component.

Now it's just a matter of getting the Translocator reference and calling its Primary and Alt-Fire events for each of the associated input action events. Here's a video showcasing the final result of what it should look like (ignore the lights & antenna on the Destination Module, they're something I just added for show).

And thus we finally come to the end of this tutorial. I'll be sharing the project files on GitHub soon and will update the link over here once it's done. Apart from that, I'm also planning to write an article on using EQS to create 2D Line of Sight Visualizations in Unreal. So if that's the sort of thing you're interested in, keep an eye out for it in the upcoming week.
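
To summarize the Source Module's control flow outside of the node graph, here is a small state-machine sketch. It is plain Python, not Blueprint or Unreal C++: the class names and the `Vector` alias are made up for illustration, the launch is reduced to "the module ends up at a given landing location" with no projectile physics, and the rotation alignment step is left out.

```python
from dataclasses import dataclass
from typing import Tuple

Vector = Tuple[float, float, float]


@dataclass
class DestinationModule:
    location: Vector = (0.0, 0.0, 0.0)
    attached: bool = True

    def launch(self, landing_location: Vector) -> None:
        # Stand-in for detaching and running the projectile flight.
        self.attached = False
        self.location = landing_location

    def reset(self) -> None:
        # Stand-in for the Reset event: the module snaps back to the weapon.
        self.attached = True


@dataclass
class Player:
    location: Vector = (0.0, 0.0, 0.0)


class SourceModule:
    def __init__(self, destination: DestinationModule) -> None:
        self.destination = destination
        self.translocator_fired = False  # mirrors bTranslocatorFired

    def retrieve_destination_module(self) -> None:
        self.destination.reset()
        self.translocator_fired = False

    def primary_fire_trigger_pressed(self, landing_location: Vector) -> None:
        if not self.translocator_fired:
            self.destination.launch(landing_location)
            self.translocator_fired = True
        else:
            # Second press retrieves the module and primes the device again.
            self.retrieve_destination_module()

    def alt_fire_trigger_pressed(self, player: Player) -> bool:
        """Teleport the player if the module is out; return True on success."""
        if not self.translocator_fired:
            return False
        player.location = self.destination.location
        self.retrieve_destination_module()
        return True
```

The single boolean is enough here because the device only ever has two states: module docked or module deployed.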
Building the UT Translocator using UE4 Blueprints - Part I
As mentioned in the previous post, the original Unreal Tournament holds a special place in my memories when it comes to gaming experiences. And not necessarily because of the competitive elements, but more so because it took me to an assortment of diverse & beautiful fictional settings, inspired not just by science fiction, but also by fantasy and medieval history. It also had a kick ass soundtrack to boot. So after playing the game again recently, I wanted to try and recreate some of its gameplay mechanics in Unreal Engine. While my first project was focused on Creating a Jump Pad (inspired by the UT99 Jump Boots), this time, I figured why not go for something more iconic and decided to recreate the Translocator.

At first glance, it would seem like all you'd have to do would be to just launch a projectile and then alt fire to teleport to its location. But upon closer inspection, you can find that there is a bit more to this device. For example, whenever the Translocator's Destination Module lands on the ground, it always ends up facing upwards by the time it comes to rest.

In terms of actual gameplay features, there is also the option to Telefrag your opponents (which I won't be covering in this tutorial) by throwing the Destination Module to their location before teleporting to it. So while the teleportation aspect is what comes to mind first when you think of it, the presence of the more obscure features makes the Translocator somewhat unique as a weapon/gadget when compared to a generic teleporter. So anyways, without wasting any more time on the specifications of the device, let's get right down to the design process.

Alright, so the first order of business is to create an Actor blueprint for our projectile. I'm going to name it BP_DestinationModule. Now we're going to open the new blueprint and untick its Start with Tick Enabled parameter.
We'll be turning it on and off manually at runtime, but more on that later.

Now let's add a Static Mesh Component to the blueprint to act as our projectile mesh. Since we don't want this mesh to collide with anything before we fire the Translocator, let's set its Collision Enabled parameter to No Collision by default.

Normally you would add a Projectile Movement Component as the next step. However, unlike your typical bullet projectiles, the Translocator fires a pod that can be retrieved and fired again. And that brings us to a complication, since the Projectile Movement Component cannot be restarted once the actor has totally come to a halt. You can try reactivating it or resetting the initial velocity, but that's not going to be of any help. Hence we're going for an alternative approach: adding/removing projectile movement components at runtime. Every time we launch the module, we'll add a component, and once it comes to a stop or is retrieved by the player, we'll destroy it.

With that said, we'll start out by adding a new boolean variable bIsActive to our Destination Module blueprint. This variable will be used to update and keep track of the module's state.

Next, we're going to create a custom event for launching the Destination Module. As shown in the screenshot below, we first check if bIsActive is True. If the module is already active, we do not want to do anything. But if it isn't, we'll add a Projectile Movement Component (you can find its default settings in the screenshot) and bind a new event to its On Projectile Stop event. Let's call this new event OnProjectileStopped. We'll get back to its implementation in the next step, but basically what's happening here is that this new event will be called as soon as the projectile motion is completed. Finally, we're also going to turn on the collision and set the actor to tick.

Alright, let's move on to the implementation for the OnProjectileStopped event that we added in the previous step.
First, we're going to destroy the Projectile Movement Component because we'll be spawning in a new one next time. The actor tick can also be disabled since it is no longer required during this phase. Since these two steps are going to be called again elsewhere, we'll group them under a new event DisableMovement.

With the launch system taken care of, we now move on to the process of resetting the Destination Module when the player retrieves it. For this purpose, we add a new event Reset, which is also going to call the OnProjectileStopped event, followed by two additional nodes to disable the collision and reset the value of the bIsActive variable.

So at this point, we have essentially finished what I'd call the core logic of the Destination Module. But in the spirit of keeping it true to the inspiration, I'll also show you how to make sure that it always lands facing upwards. And this is where the Event Tick comes into play.

As shown above, we're using FInterp To nodes to interpolate the Roll & Pitch values of the Mesh rotation to 0.0 while keeping the Yaw as it is. This will ensure that the mesh always faces upwards. However, you might have noticed that there is a branch node placed at the start of the tick event. It's basically checking if the velocity (vector length squared) is below a certain threshold (which I've set to 625, i.e. a speed of 25 units per second), thus causing the rotation updates to kick in only moments before the projectile comes to rest.

And with that, the Destination Module of our Translocator is ready for use. I had initially assumed that the entire tutorial would be shorter than this. But since this post has already gotten a bit too long, I will be sharing the remainder of the tutorial in a second part, which will cover the projectile firing and retrieval mechanisms for the Translocator. But if anyone's interested in what the final result looks like, you can find the video on my YouTube channel. So see you in the next part of the tutorial.
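
The tick logic for the upright landing can be sketched in a few lines. This is an engine-agnostic Python approximation: `finterp_to` follows the general shape of the FInterp To node (move a fraction of the remaining distance each frame, proportional to delta time and interp speed), `tick_upright` is a hypothetical helper, and the interp speed of 10 is an assumed value, since the post does not state the one used in the screenshot.

```python
from typing import Tuple

Rotation = Tuple[float, float, float]  # (roll, pitch, yaw) in degrees
Vector = Tuple[float, float, float]

SPEED_SQUARED_THRESHOLD = 625.0  # 25 units per second, squared


def finterp_to(current: float, target: float, delta_time: float,
               interp_speed: float) -> float:
    """Approximation of FInterp To: ease 'current' toward 'target'."""
    if interp_speed <= 0.0:
        return target  # no smoothing requested, jump to the target
    distance = target - current
    if distance * distance < 1e-8:
        return target  # close enough, snap to avoid endless tiny steps
    alpha = min(max(delta_time * interp_speed, 0.0), 1.0)
    return current + distance * alpha


def tick_upright(rotation: Rotation, velocity: Vector, delta_time: float,
                 interp_speed: float = 10.0) -> Rotation:
    """Level roll and pitch toward 0 once the module has almost stopped."""
    speed_squared = sum(component * component for component in velocity)
    if speed_squared >= SPEED_SQUARED_THRESHOLD:
        return rotation  # still flying, leave the rotation alone
    roll, pitch, yaw = rotation
    return (
        finterp_to(roll, 0.0, delta_time, interp_speed),
        finterp_to(pitch, 0.0, delta_time, interp_speed),
        yaw,  # yaw is kept as it is
    )
```

Comparing the squared length against 625 avoids a square root every frame, which is presumably why the vector length squared is used in the branch check.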
I was recently playing the original Unreal Tournament again and realized that the game holds up pretty well even now, a couple of decades after its release. I admit that nostalgia might partly be the culprit, but as soon as the intro scene started playing, hearing that voice and background score, it felt like getting into a time capsule back to the early 2000s. Booting up some levels, I was genuinely surprised by how atmospheric most of them felt.

The diverse background settings in which the matches took place, along with the excellent soundtrack that accompanies them, seemed to create a sense of being in that virtual space that you don't often get in a lot of games. And one of those levels that has really stuck with me is the Capture the Flag map "CTF-LavaGiant", with its beautiful skyscape and the sea of lava surrounding the core game space. After loading it up for a quick game, I came across a pair of Jump Boots on the way back from the enemy base, which I then used to vault over the walls of their fort.

It was quite fun jumping over the lava to make my way back to our base. And then I figured why not try to make something similar in Unreal Engine. It's quite simple, but I had never tried any experiments with jumping mechanics. Besides, I rarely find challenge to be a deciding factor when it comes to motivation. It's almost always about the joy of just exploring new possibilities.

So the next day, after work, I went ahead and created a new UE4 project. Since I wanted to create a wall jumping mechanic as well (inspired by ULTRAKILL), I decided to go for a Jump Pad instead of Jump Boots for this project.

The basic setup was simple. Create an actor that has a collision volume, which when triggered by the player character, will push it in the upward direction.
The Blink Ability project that I had worked on a year or two back involved some logic that pushed the character over the wall to roughly simulate climbing after teleporting to a ledge above you.

And while I ended up using the "Move Component To" node instead of the Launch Character node back then, here the latter seemed perfect. As soon as the Jump Pad collision box registers an overlap, it checks if the overlapping actor is a character type and, if true, launches it in the upward direction.

However, after testing the Jump Pad a few times, I noticed that the jump heights weren't always the same. When the character walked up to the Jump Pad, it jumped much higher than when it was falling on to the Jump Pad. The issue was fixed by setting the ZOverride parameter on the node to True.

Basically what this does is ignore the Z component of the character's velocity and assign the launch velocity value to it. Now even if your character is falling down to the Jump Pad, the negative value of the fall velocity Z component is not taken into account when launching the character upwards. So I guess that is something new that I've learnt as part of this experiment.

Anyways, with that taken care of, my character was jumping around the map quite fine. And that's another experiment successfully completed. I also went ahead and recreated the Translocator from Unreal Tournament after this project. I want to try out the Shock Rifle at some point as well. But one thing at a time. Right now, I'm just glad to have started writing again after more than a year of hiatus. Hoping to keep this going.
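
The effect of ZOverride is easy to see with a couple of numbers. The sketch below is plain Python that imitates how the launch velocity is combined with the character's current velocity; `launch_character` here is a stand-in written for this example, not the engine function, and the velocity values are made up.

```python
from typing import Tuple

Vector = Tuple[float, float, float]


def launch_character(velocity: Vector, launch_velocity: Vector,
                     xy_override: bool = False,
                     z_override: bool = False) -> Vector:
    """Combine the current velocity with the launch velocity.

    Without an override the launch is added on top of the current velocity;
    with an override the launch value replaces that component.
    """
    vx, vy, vz = velocity
    lx, ly, lz = launch_velocity
    new_x = lx if xy_override else vx + lx
    new_y = ly if xy_override else vy + ly
    new_z = lz if z_override else vz + lz
    return (new_x, new_y, new_z)


pad_launch = (0.0, 0.0, 1000.0)   # illustrative Jump Pad strength
walking = (300.0, 0.0, 0.0)       # character walks onto the pad
falling = (300.0, 0.0, -600.0)    # character drops onto the pad

# Additive launch: the fall speed eats into the jump.
weak_jump = launch_character(falling, pad_launch)                     # z = 400
# With ZOverride the jump is the same no matter how the pad was reached.
same_jump = launch_character(falling, pad_launch, z_override=True)    # z = 1000
```

Note that the horizontal velocity is left additive here so the character keeps its momentum across the pad.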
I've been playing a fair bit of stealth & tactical action games of late and noticed that most of them have some form of enemy tagging system to help the players form a better tactical awareness of the game world. I generally note down interesting gameplay systems that I come across, in order to study them in detail at a later time. But since this particular mechanic didn't seem like it would take up much time, I decided to jump into the process right away.

I started doing some research on the concept and the various approaches taken by different games to implement it. And I took a particular liking to Metal Gear Solid V's take on tagging systems, with its added support for range display as well as the highlighting of occluded objects. So I decided to go ahead and recreate it in Unreal Engine, and this post is basically a high-level retrospective overview of the implementation process. But before getting into the details, here is a super quick preview of what the end result is going to look like:

Alright, so without further ado, let's dive into the design process behind the experiment.

The first order of business here is to create a custom widget that can display the tag image as well as the distance to the player character. Since the tag needs to hover above the target at all times, we can use a widget component and leverage its inbuilt functionality to attach itself to actors. However, since highlighting of occluded targets is also part of the agenda here, it makes sense to render the widget in screen space.

However, if we try this out in the editor, we'll notice an issue once we start moving away from the tagged actors. The widget will start covering more and more of the actor until, at very large distances, the actor becomes barely visible at all. This happens because the relative distance from the actor to the widget component remains the same, while the widget being rendered in screen space retains its default size. And that is not desirable.
Now there are two ways to resolve this. The first involves changing the widget size dynamically, while the other approach revolves around updating the relative location of the component at runtime. After taking another look at the workings of tagging systems from a few real games, I noticed that the second approach is generally favored, and that's the one that we're going to take here. While I tried out multiple types of alignment correction models, it finally came down to just using a simple linear multiplier based on the distance involved.

Here is a comparison of the tagging system with and without the distance-based alignment corrections (I also threw in some script to display the distance text):

Now that we have a working tag widget, we can throw in some player input driven logic for adding tags to actors manually. A simple line trace-driven check would suffice in this regard. We can take the hit result data and request activation of the tag display if the hit actor has a tag widget component.

And that brings us to the final section: implementation of occluded object highlights. To this end, we can use a post-process material with custom stencils enabled. I'm not particularly good with materials, but fortunately, Rodrigo Villani has already created an awesome tutorial on how to create outlines in Unreal Engine. I took the basic material setup explained in the tutorial and threw in some additional script to add translucent filling within the outline area. And that's about it. Here is a preview video of the enemy tagging system in action:

With that, we have come to the end of another experiment. Eventually, I hope to use this space to write about all of my experiments, but it is probably going to be a while, given my trajectory so far. But I generally keep my YouTube channel updated with the latest projects. So if you'd like to see more cool experiments in Unreal Engine, you know where to find them.
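
The distance-based alignment correction boils down to one line of arithmetic. Here is a minimal Python sketch of it: the base offset, the multiplier and the function names are all illustrative assumptions (the post does not give the actual values), and the distance label assumes the usual Unreal convention of one unit per centimetre.

```python
import math

BASE_OFFSET_Z = 100.0      # assumed height of the tag above the actor when close
OFFSET_PER_UNIT = 0.05     # assumed linear multiplier applied to the distance


def tag_offset_z(distance_to_player: float) -> float:
    """Raise the widget component linearly as the player moves away.

    The widget keeps a constant size on screen, so the farther the actor is,
    the higher (in world units) the tag has to sit to avoid covering it.
    """
    return BASE_OFFSET_Z + max(distance_to_player, 0.0) * OFFSET_PER_UNIT


def distance_label(distance_to_player: float) -> str:
    """Format the range text shown on the tag, in whole metres."""
    metres = distance_to_player / 100.0  # assumes 1 unit = 1 cm
    return f"{metres:.0f} m"
```

A per-frame update would then compute the distance between the tagged actor and the player, feed it through `tag_offset_z`, and set the result as the Z value of the widget component's relative location.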
Unreal Engine Experiments: Last Known Position Visualization
The blog has been dark for a while now. But the past few months have been quite a fun experience as I got to experiment with a whole host of interesting gameplay systems in Unreal Engine. And I have to admit that the prospect of writing about them is not nearly as exciting as working on them. But I have finally summoned the willpower to get one article published over this weekend. So I figured that I'll go ahead and write about the most exciting project that I've worked on (since the recreation of the Blink ability from Dishonored): the Last Known Position mechanic from Splinter Cell Conviction.

As the title suggests, we're going to cover the process of visualizing the player character's last known position (as perceived by the AI). The mechanic itself should be quite familiar to those who have played either of the last two entries in the Splinter Cell franchise. But in case you're not, here is a short animated preview of what exactly the end product is going to look like:

Alright, so with that out of the way, let's get into the nitty-gritty of the experiment. Basically, there are three main steps required for implementing the visualization system:
1. Create a translucent silhouette material
2. Set up an animation pose capture & mirror system
3. Implement a basic AI perception system for tracking purposes

Now let's go over each of them in order, starting with the material creation process.

Silhouette Material
I started out with this because I had absolutely no clue how to get this working. So if anything was going to be a showstopper, it was probably going to be this one. I mean you can't just throw in a basic translucent material and call it a day. The material script also needs to be able to cull the inner triangles of the mesh. Being a complete noob at materials, I turned to the internet for help.
Thankfully, Tom Looman had already posted a custom depth based solution on his blog, and it involves the use of two similar overlapping meshes: a translucent mesh rendered in the main pass, and an opaque one rendered in custom depth. Here is a preview of the final result:

Well, with that taken care of, let's head over to the next step in the process.

Visualization Pose Capture
I'm not very familiar with the animation side of UE4, but this part of the process actually had a relatively more straightforward solution. While the first idea that came to my mind was to copy the player character's animation poses over to a new skeletal mesh component, I wasn't particularly keen on going down that route, because there was no real need for a full-fledged animation system for our visualization mesh. We just need to set a pose once and then forget about it. Fortunately, after doing some research, I stumbled upon this neat little thing called the Poseable Mesh component.

The Poseable Mesh component was exactly what was required for this scenario. It is intended to be used for one thing only: to mirror a single pose from another skeletal mesh. No unnecessary features involved. And it comes with an inbuilt function that lets you do that by passing in a reference to the target skeletal mesh component. Just copy the target's transform coordinates as well and we're done.

And now on to the final part of the experiment.

AI Perception
I went ahead with Unreal's inbuilt AI Perception system for this one. I'm not going over the details here as there are quite a few good resources available within the community already. But the basic gist is that I'm using it to keep track of AI agents gaining/losing track of the player character.

With this information, we just plop down our visualization actor every time the player evades the AI. And there you have it: a recreation of the Last Known Position mechanic from Splinter Cell Conviction.
Here is a video preview of the system in action:

With that, we have come to the end of another experiment. I've shared the project files on GitHub, so feel free to use them in your work. Also head over to my YouTube channel if you're interested in checking out more cool experiments in Unreal Engine. Alright, so that's it. I hope to publish the next post sometime during the next weekend. Until then, goodbye.
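
To make the interplay between the perception updates and the pose capture a little more concrete, here is a small bookkeeping sketch. It is plain Python with made-up names (`PoseSnapshot`, `LastKnownPositionTracker`), not the AI Perception or Poseable Mesh API, and it assumes one particular rule that the post does not spell out: the marker is dropped when the last agent that could see the player loses sight of them.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple

Vector = Tuple[float, float, float]


@dataclass
class PoseSnapshot:
    location: Vector
    bone_rotations: Dict[str, float]  # simplified stand-in for a full pose


class LastKnownPositionTracker:
    def __init__(self) -> None:
        self.agents_with_sight: Set[str] = set()
        self.marker: Optional[PoseSnapshot] = None

    def on_perception_updated(self, agent_id: str, sensed: bool,
                              player_location: Vector,
                              player_bone_rotations: Dict[str, float]) -> None:
        """Handle one 'gained sight' or 'lost sight' update from an AI agent."""
        if sensed:
            self.agents_with_sight.add(agent_id)
            return
        had_sight = agent_id in self.agents_with_sight
        self.agents_with_sight.discard(agent_id)
        if had_sight and not self.agents_with_sight:
            # Nobody can see the player any more: capture the pose once.
            # The dict is copied so later animation changes do not leak in.
            self.marker = PoseSnapshot(player_location,
                                       dict(player_bone_rotations))
```

Copying the pose at the moment of evasion reflects the "set a pose once and then forget about it" idea: the marker stays frozen while the real character keeps animating.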
A few weeks ago, I came across an article on Gamasutra about the various types of UI systems used in video games. I was never particularly interested in UI design, but this article piqued my interest in the subject. So I started reading up more on the subject matter and played through a few games like Dead Space and Tom Clancy's Splinter Cell: Conviction, both of which were lauded for their innovations in the UI design space. Even with the games being almost a decade old at this point, the UI systems employed by these games are starkly different when compared to most of their contemporaries.

Anyways, playing through Splinter Cell: Conviction got me really interested in the concept of Spatial UI design. Basically, this form of design represents UI elements that are displayed within the game world but are not actually a part of the world/setting. After doing some research on various types of systems that come under this category, I decided to recreate some of these UI components in Unreal Engine. To that end, I started work on a couple of projects, the first one being the Waypoint Generator. Now, I had previously developed a couple of functional waypoint generation systems as part of my Tower Defense toolkits. So instead of starting the project from scratch, I just migrated the required blueprints over to a new project and started working from there.

The basic underlying logic revolves around the use of the nav mesh to obtain path points from the player character towards the active objective. The path thus obtained is then divided up into smaller segments before adding them to a spline component. The generation of these additional path points serves the purpose of removing weird twisting spline artifacts that occur around sharp corners when dealing with a very limited set of spline points. With that potential problem taken care of, all that's left is to lay down instanced static meshes to display waypoints along the path.
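
The path subdivision step can be sketched as follows. This is a standalone Python illustration rather than the Blueprint logic from the project: `subdivide_path` and the maximum segment length are hypothetical, and the input is simply a list of 3D points standing in for the path points returned by the navigation system.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, float, float]


def subdivide_path(points: Sequence[Vector],
                   max_segment_length: float) -> List[Vector]:
    """Insert evenly spaced points so no segment exceeds max_segment_length."""
    if max_segment_length <= 0.0:
        raise ValueError("max_segment_length must be positive")
    if not points:
        return []
    result: List[Vector] = [points[0]]
    for start, end in zip(points, points[1:]):
        length = math.dist(start, end)
        # Number of equal pieces this segment is cut into (at least one).
        pieces = max(1, math.ceil(length / max_segment_length))
        for i in range(1, pieces + 1):
            t = i / pieces
            result.append(tuple(s + (e - s) * t for s, e in zip(start, end)))
    return result
```

The denser point list is what would then be fed to the spline component, so that a sharp corner is surrounded by nearby points instead of two distant ones.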
Moving on to the design structure of the implementation, it uses a child actor component to attach the waypoint generator to the player. Within the construction script of the generator, there's also an option to try out the system in the editor for debugging purposes, as shown below:

The system, however, does have a limitation when it comes to displaying waypoints along certain types of inclined surfaces. Basically, from what I've heard, the navigation system in Unreal Engine tries to reduce redundancy as much as possible while generating path points. This can sometimes lead to a situation where a line drawn from one path point to the next ends up passing under the surface, or quite a bit above it, when dealing with stairs and other steeply inclined surfaces. Splitting up the path into smaller segments as I've mentioned earlier will not help in this scenario because it doesn't really take the navigational paths into account. It's basically just dividing a line without any other concern.

But in any case, I've added a system that can mitigate this issue to some extent by using line traces to check the ground location at all points before placing down the waypoint meshes. It may not be able to correct the rotational data between path points in certain scenarios, but it always makes sure that the meshes are placed just above the ground location. If anyone knows of a better way to get around this issue using blueprints, I would really like to hear about it. So feel free to share it in the comments section.
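
The ground correction pass described above might look something like this in code. It is a Python sketch under a few assumptions: `trace_ground` is a hypothetical callback standing in for a downward line trace (it returns the ground height at an X/Y position, or `None` when nothing is hit), and the hover offset is an arbitrary value.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Tuple[float, float, float]
GroundTrace = Callable[[float, float], Optional[float]]


def snap_to_ground(points: Sequence[Vector], trace_ground: GroundTrace,
                   hover_offset: float = 10.0) -> List[Vector]:
    """Place every waypoint just above the ground found below (or above) it."""
    snapped: List[Vector] = []
    for x, y, z in points:
        ground_z = trace_ground(x, y)
        if ground_z is None:
            # The trace missed: keep the interpolated height as a fallback.
            snapped.append((x, y, z))
        else:
            snapped.append((x, y, ground_z + hover_offset))
    return snapped


def staircase(x: float, y: float) -> Optional[float]:
    """Toy ground: a step of height 20 every 100 units, nothing past x = 500."""
    if x > 500.0:
        return None
    return float(x // 100) * 20.0
```

As noted in the post, this only fixes the height of each mesh; the direction between neighbouring waypoints can still look off on steep inclines.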
Unreal Engine Experiments: Prototype Menu System v2.0 Update
About three years ago, I created a menu system with the intent of having UI elements that could be easily tacked on to all of my projects. The project was released for free on GitHub and received a slew of updates for a while. But after shifting my focus over to creating content for the Unreal Engine Marketplace, I found myself having very little breathing room for working on side projects. And eventually, work on the menu system was abandoned, though it was still available for public use in its Unreal Engine v4.9 iteration.

However, lately, I've been investing more of my spare time in some fun little side projects and, to be honest, finding it quite enjoyable and refreshing. So after my recent foray into recreating the Blink ability from Dishonored, I found myself thinking about bringing the project back online and actually seeing it through to completion.

Loading up the project again in the latest version of Unreal Engine, I was surprised to find that it was quite compatible with the new version. But as I went through the code, it became glaringly obvious that most of it would have to be completely revamped. The menu system was working quite alright, but three years is a long time, and I had originally worked on it just a few months after I first started using Unreal Engine. And going through the project again, the code spoke for itself as to how cringeworthy some of the workflows were. As a result, most of the time spent working on this new update was focused on improving upon the existing codebase.

In any case, the work is done and since I absolutely suck at making video demonstrations, I'll just briefly go over the various menu screens available in the v2.0 edition.

Main Menu
The main menu allows you to either start a new game, go to the options menu, or quit the game.

Options Menu
While the options menu has four different sub-options available, only the display and graphics options are functional in the current state.
Display Options Menu
Players can control the screen resolution and window mode settings through this menu.

Graphics Options Menu
As shown in the screenshot, the graphics options menu allows you to control the following settings:
- AA Quality
- Foliage Quality
- Post-Processing Quality
- Texture Quality
- Shadow Quality
- View Distance Quality
- Effects Quality
- VSync

Loading Screen
It's basically a screenshot that gets displayed for a specified period of time. A throbber is placed to indicate that the level is being loaded.

Pause Menu
The pause menu provides the options to either resume the game, exit to the main menu, or quit directly to the desktop.

Well, that covers all the major features of the Prototype Menu System in its current state. I'm planning to introduce more features over future updates in order to make it a more robust and complete system. But for now, you can grab the source code from GitHub at: https://github.com/RohitKotiveetil/UnrealEngine--PrototypeMenuSystem
Around a couple of months ago, I finally managed to finish Dishonored. I had tried playing it a couple of times in the past but got turned off both times by the starting section of the game, which I still think is one of the weakest parts of the game. Even though it was very clearly trying to make the player develop an emotional attachment to one of the primary characters, it felt more like a chore to me. The protagonist was obviously close to the said character, but none of that resonated with me as a player who was completely new to this world. I was more interested in exploring the world, with its huge whale hunting ships and a new and original setting, but you have to go through a linear and somewhat uninteresting gameplay section. Frankly, I'd rather have the game take me sooner to the scripted story sequences before moving on to the first real mission. But leaving that aside, after having played the game through to completion, I can definitely say that I thoroughly enjoyed the rest of the game once the world opened up and provided opportunities to explore and study its various intricacies. However, what really made the game stand out for me was its Blink ability, and the game does not wait long to present it to the player.

Once you get access to the ability, a whole new array of gameplay possibilities opens up to you. It's essentially a single gameplay mechanic tailored towards multiple types of gameplay styles. You can become an explorer, navigating from the tallest buildings to the deepest alleyways with the sort of freedom of movement not usually allowed in games (when you factor out the crawling through vents design). Or you can choose to play like a ninja, appearing suddenly from the shadows to strike your opponent, only to disappear again in an instant. If you prefer a more aggressive playstyle, the Blink also provides the player with a tool to quickly close the distance to opponents before plunging a blade into their throats.
To be honest, it is the closest I've come to feeling like an anime character in a first-person game, moving swiftly across the battlefield and taking down opponents with finesse. And as is usually the case when I get excited about something like this, I had to learn how it works and recreate it on my own. Fortunately, it didn't just end up as another entry in the backlog of cool experiments to try out, and I actually got around to working on it.
I first started out by studying the ability to look for any hints of design that could be visible under close inspection. The first thing, and quite easy to notice, was that it wasn't a teleport ability. The player character was being moved to the targeted destination while visual effects played out on the screen. With my extremely limited knowledge of materials and VFX, the only effect obvious to me was the field of view modification. And that would have to do, since the main goal was to understand the workings of its physical movement system. Upon further inspection, I came across a less obvious design choice. The game was not using a line trace for the targeting system. This is easy to notice when using the ability near waist-high walls. If the aiming direction is only slightly above the wall, it will display the target location right in front of the wall. So it seems that a sphere trace (or a trace with some other simple 3D shape) is being used to ensure uninterrupted movement to the destination.
So with a basic idea of how things might be working under the hood, I began work on the implementation. The first task was to just move the player towards where the camera was being aimed. The built-in 'Move Component To' function took care of this requirement. I added a couple of timelines to change and revert the field of view values during the process. Already by this point, my character was easily darting around the map using the ability.
Next up on the itinerary was the targeting system. 
Again, my intention here was not to spend time on making effects that looked exactly like the original inspiration. Instead, a basic cylinder mesh having a gradient material with its transparency increasing along the +z direction would do just fine. Again, my lack of experience working on materials became an issue here. Fortunately, after scrounging through a few pages on the net, I came across a solution that did exactly what was required. With the gradient material set up, I just needed to move the target display actor based on the results of a sphere trace fired at regular intervals. Now all that was left was the wall scaling system. I already had a placeholder system that used the 'Launch Character' function to propel the character up the wall when necessary. However, it was too slow and felt out of sync when used in conjunction with the swift Blink movement. And I wasn't really sure how to get it right. Another potential approach would have been to use linear interpolation along a parabolic curve to the top of the wall. I wasn't particularly fond of the idea and was hoping it wouldn't come to that. Fortunately, I tried out the 'Move Component To' node again in this scenario and it actually worked out quite well. Next, I added a check to see if the obstacles encountered by the targeting system fall into the category of 'walls'. If so, it is followed up with a line trace to determine the distance to the top of the wall as well as to confirm that the wall meets the minimum depth/thickness requirement. If both checks pass, a further sphere trace is performed from a calculated point just above the top surface of the wall, in the upward (+z) direction, to ensure the availability of free space for the player character to stand upright. If this condition is satisfied as well, a direction pointer gets displayed to convey that the character will automatically scale the wall along the said direction after the Blink movement. 
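The chain of wall checks described above can be sketched as a small decision function. This is only an illustration in Python rather than the project's blueprints: the trace results are passed in as plain values instead of being produced by engine traces, and every name and threshold here (WallProbe, MAX_SCALE_HEIGHT, MIN_WALL_DEPTH) is a hypothetical stand-in, not something taken from the project.

```python
from dataclasses import dataclass


@dataclass
class WallProbe:
    """Trace results for one targeting update (all field names are hypothetical)."""
    hit_is_wall: bool        # did the targeting trace hit something classed as a 'wall'?
    distance_to_top: float   # line trace result: distance up to the top of the wall
    wall_depth: float        # line trace result: thickness of the wall
    headroom_is_clear: bool  # upward sphere trace from just above the top found free space


# Assumed tuning values; the post does not state the real ones.
MAX_SCALE_HEIGHT = 200.0
MIN_WALL_DEPTH = 50.0


def should_show_scale_pointer(probe: WallProbe) -> bool:
    """Return True when the direction pointer should be displayed."""
    if not probe.hit_is_wall:
        return False
    # Both line-trace requirements must hold before the headroom check matters.
    if probe.distance_to_top > MAX_SCALE_HEIGHT:
        return False
    if probe.wall_depth < MIN_WALL_DEPTH:
        return False
    # Final condition: enough free space for the character to stand upright.
    return probe.headroom_is_clear
```

Ordering the checks from cheapest to most expensive mirrors the description: the extra sphere trace only needs to run once the obstacle has already qualified as a scalable wall.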
With the wall scaling mechanism already in place as mentioned earlier, the ability was finally working as intended to the fullest extent.
With all of the required features working in tandem, all that was left was to clean up the code. A new custom actor component was created to house the Blink execution logic. This freed up the player character to handle only the input controls and a simple interface function to control the field of view. The use of this component-driven design should allow the ability to be linked to new player characters quite easily.
In the end, I must say that it felt really good to work on something that can pretty much be classified as finished. It's a huge contrast to my normal work on the toolkits, which require a lot of updates into the future. So I'm excited to keep working on more of these small offshoot projects. Anyway, the source code (blueprints) for the project has been published on GitHub. Feel free to check it out at: https://github.com/RohitKotiveetil/UnrealEngine--BlinkAbility
FPS Tower Defense Toolkit Tutorial: How to create a new level
1. First, ensure that the default Game Mode & Game Instance class parameters in the Project Settings are set to BP_GameMode_TowerDefense & BP_GameInstance classes respectively.
2. Now create a new map, open it, & lay down the floor meshes. Add a Nav Mesh Bounds Volume & extend it to encapsulate all the floor meshes. This will enable AI bots to traverse the level.
3. Add a Lightmass Importance Volume around the core game space.
4. Now add a BP_PowerCore actor to the level in order to provide the primary target for the enemy AI bots.
5. The next step is to add instances of BP_EnemySpawnPoint to act as spawning volumes for the enemy waves. As can be seen in the next screenshot, I've added a couple of spawn points in my level: Now select each of the spawn points, and set their Target Power Core parameters from the dropdown list shown above. This parameter is used to assign the Power Core that will be targeted by enemies spawned at the said spawn point.
6. The toolkit uses modular grid generators to act as platforms for tower placement. But before adding them to the scene, first add a BP_GridManager actor. The Grid Manager determines the size of individual grid cells across all grid generators in the level through its publicly exposed variable 'GridCellSize'.
7. With the grid cell size specified, it's now time to add the grid generators. Drag & drop instances of BP_PlanarGridGenerator into the level. The overall size of the generator in the X & Y (local space) directions can be specified through the 'GridCountX' & 'GridCountY' variables as shown below:
8. The next step is to add an instance of BP_EnemyAIManager. It takes on the responsibility of keeping track of potential targets for the enemy AI bots. While the target acquisition and response logic are handled directly by the bots themselves, the Enemy AI Manager aims to provide them with all the necessary information in that regard.
9. 
Now come two of the most important classes required for the functioning of the toolkit: the Tower Manager and Wave Spawn Controller. Add the BP_TowerManager actor first. There's no need to make any changes to its default parameters, but if you're interested in knowing more about the class, check out the concept overview post for the same at: https://forums.unrealengine.com/unreal-engine/marketplace/50583-fps-tower-defense-toolkit?p=647339#post647339
Next, add an instance of either BP_BatchedWaveSpawnController or BP_WeightedWaveSpawnController to the level. While most of the default parameters will work right out of the box without any need for customization, you will have to add references to the enemy spawn points within the publicly exposed array 'BatchedWaveSpawnDataArray' if you're going for the former model. Just make sure to specify the 'SpawnPoint' parameter for each wave by linking it to one of the spawn points added to the level in Step 5. For further reading about the wave spawning systems, see:
FPS Tower Defense Toolkit Basics: Batched Wave Spawn Controller
FPS Tower Defense Toolkit Basics: Weighted Wave Spawn Controller
10. Now onwards to the final step. All that's left to do is to add level bounds so that actors like projectiles that can move across the level get destroyed once they cross a certain threshold. Add six BP_LevelBounds actors along the outer periphery of the level, making sure that they form a cuboid shape to encapsulate the entire level.
With that, you should have a fully functional level at your disposal. If you have any queries regarding the workflow, feel free to let me know in the comments section.
If you're interested in the toolkit, you can purchase it through the Unreal Engine Marketplace: https://www.unrealengine.com/marketplace/fps-tower-defense-toolkit
The FPS Tower Defense Toolkit comes equipped with two different types of wave spawning models: one centered around user-defined wave patterns, and another geared towards generating waves based on weighted probability distributions. Today I'm going over the design of the second model: the Weighted Wave Spawning System.
The WeightedWaveSpawnController is intended to provide designers with a tool capable of generating waves of AI bots in an automated manner. Unlike its counterpart, which requires the designer to explicitly specify each and every aspect of the wave, we have a system here that spawns units based on a weighted probability distribution. This approach essentially allows us to control the spawn probabilities of different AI classes based on the weights associated with them.
Another factor that helps with the automation process is the threat rating attribute, which keeps increasing with each subsequent wave. As the threat rating goes up, it opens the door for spawning high tier enemies. Meanwhile, the number of low tier enemies may go up as well. Both of these factors together increase the overall difficulty of the game over time.
Aiding the functioning of these systems is a host of variables that can be customized directly from the editor. I will go through each of those attributes one by one and briefly explain their individual functionalities:
1. 
WeightedWaveSpawnData: Controls the shifting dynamics of waves through the following parameters:
NumberOfWaves: Determines the total number of waves.
TowerBaseResourceAllocation_BaseValue: The number of tower bases to be distributed to the player per wave.
TowerBaseResourceAllocation_StartingBonus: Bonus tower bases to be provided to the player at the start of a level.
TowerPointResourceAllocation_BaseValue: Tower points to be distributed to the player prior to the first wave.
TowerPointAllocation_LinearGrowthMultiplier: The number of tower points to be distributed to the player keeps changing based on a linear function. This parameter controls the growth rate of tower point allocation with each new wave.
WaveThreatRating_BaseValue: Threat Rating of the first wave. Determines the starting difficulty of the level.
WaveThreatRating_LinearGrowthMultiplier: The wave threat rating increases with each new wave based on a linear function. This parameter controls its growth rate and thus provides an estimate of how fast the difficulty rises as the player progresses through the level.
SpawnInterval_MinValue: Minimum duration between spawning of individual bots.
SpawnInterval_MaxValue: Maximum duration between spawning of individual bots.
MaxUnitToWaveThreatRatio: Restricts the types of units that can be spawned in a wave based on the ratio of their threat rating to the wave threat rating. Only AI classes whose ratio is less than or equal to this parameter will be spawned. As a result, this feature can be used to prevent high tier units from being spawned during the initial waves.
2. AISpawnWeightingDataArray: Enables setting of the spawn weights, with each element of the array representing the spawn probability data for an individual AI class. Each element of the array contains the following attributes:
AIType: An enum that defines the AI model.
BaseWeighting: Controls the spawn weight of this AI class. 
Higher values relative to other classes increase the spawn chances, while lower values reduce them.
CanBeSpawned?: Determines if this AI type can be spawned. Can be used to control/restrict the types of enemies present in different levels.
DataTableRowName: Connects the AI class with the associated row in the DT_EnemyAIStats data table. [No need to explicitly specify this as it will be set automatically by the wave spawn controller]
3. NumberOfWaveCycles: Not applicable to this model. Used by Batched Wave Spawn Controllers to repeat wave patterns.
4. TimedWaveStarts?: Determines if new waves will start automatically after a designated amount of time. If turned off, the player will have to manually trigger new waves.
5. WaveTimerInterval: If TimedWaveStarts? is set to true, then this parameter controls the time interval between waves.
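To make the interplay of these parameters more concrete, here is a rough Python sketch of how a weighted pick with a threat-ratio filter could work. It is not the toolkit's blueprint logic: the exact linear formula, the per-unit threat_rating field, the ROSTER values, and all function names are assumptions made purely for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class AISpawnWeighting:
    ai_type: str
    base_weighting: float
    can_be_spawned: bool
    threat_rating: float  # assumed to be looked up from the enemy stats table


# Hypothetical roster, not taken from the toolkit.
ROSTER = [
    AISpawnWeighting("Grunt", 6.0, True, 2.0),
    AISpawnWeighting("Brute", 3.0, True, 8.0),
    AISpawnWeighting("Tank", 1.0, True, 20.0),
    AISpawnWeighting("Drone", 4.0, False, 1.0),
]


def wave_threat_rating(base_value, growth_multiplier, wave_index):
    """Assumed linear form: base + multiplier * wave_index (first wave is index 0)."""
    return base_value + growth_multiplier * wave_index


def eligible_units(units, wave_threat, max_unit_to_wave_threat_ratio):
    """Keep enabled units whose threat ratio does not exceed the cap."""
    return [
        u for u in units
        if u.can_be_spawned
        and u.threat_rating / wave_threat <= max_unit_to_wave_threat_ratio
    ]


def pick_unit(units, rng):
    """Weighted random choice: higher base_weighting means a higher spawn chance."""
    weights = [u.base_weighting for u in units]
    return rng.choices(units, weights=weights, k=1)[0]


def next_spawn_delay(min_value, max_value, rng):
    """Random delay between individual spawns, within the configured interval."""
    return rng.uniform(min_value, max_value)
```

With these made-up numbers, only the Grunt passes a ratio cap of 0.5 on the first wave, while Brutes and Tanks become eligible as the wave threat rating grows, which matches the intent of keeping high tier units out of the initial waves.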
Q: I want to change the size of the vision arcs. Where can I find the variables that control it? Is there a way to do it from the editor window? A: You can customize the radius & angle of vision arcs for all types of AI through their perception components. The toolkit uses a custom AI Perception system (BPC_AIPerception) that enables the AI bots to perceive different types of stimuli. It basically allows for four different types of perception (for further details, check out Top Down Stealth Toolkit Basics: AI Perception), out of which the Visual Perception system is responsible for hosting the parameters that determine how far & wide a bot can see. And these include the 'VisionRange' & 'HalfVisionAngle', which among other things, also control the size of the Vision Arcs. Ideally, these variables could be made public (instance editable) and thus enable customization directly through the editor. However, due to an engine bug (https://issues.unrealengine.com/issue/UE-46795) that automatically resets public struct values stored in components, I had to revert it back to being editable only from the parent blueprint to which the component is attached. So this means that the aforementioned attributes will require editing through the details panel of perception components in AI blueprints (as shown in the screenshot below). Doing so will instantly apply the changes to all actors of that particular AI class. I understand that this is a bit tedious compared to directly editing these attributes from the editor details panel, but Epic has marked the bug as fixed for the v4.19 release. So if the fix does make it into the final release, it should then be possible to set the variable to public and directly test out changes through the editor itself.
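For readers who want to see how a range and a half angle together define a vision arc, here is a minimal 2D sketch in Python. It is not the toolkit's BPC_AIPerception logic; the function name and the tuple-based positions are assumptions, and it deliberately ignores line-of-sight blocking by obstacles.

```python
import math


def is_in_vision_arc(agent_pos, agent_forward, target_pos,
                     vision_range, half_vision_angle_deg):
    """Return True if target_pos lies inside the agent's 2D vision arc."""
    dx = target_pos[0] - agent_pos[0]
    dy = target_pos[1] - agent_pos[1]
    distance = math.hypot(dx, dy)
    if distance > vision_range:
        return False  # too far away, regardless of direction
    if distance == 0.0:
        return True   # target is on top of the agent
    fx, fy = agent_forward
    forward_length = math.hypot(fx, fy)
    # Angle between the forward vector and the direction to the target.
    cos_angle = (dx * fx + dy * fy) / (distance * forward_length)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    angle_deg = math.degrees(math.acos(cos_angle))
    return angle_deg <= half_vision_angle_deg
```

Increasing vision_range lengthens the arc, while increasing half_vision_angle_deg widens it on both sides of the forward direction, which is why the parameter is a half angle.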
Q: I noticed that the turrets are disabled when I start a new game. But then they sometimes get activated over the course of a game. Why is it behaving this way, and how can it be enabled right at the start of a mission?
A: The turret AI in Top Down Stealth Toolkit is set to a deactivated state by default. This is an intended feature designed to showcase the use of automated security devices as a form of backup system for the AI. The default behavior is to activate them once the Global Alert Level escalates to Stage I, which is why they seem to get turned on sometimes during the mission.
However, this design is not set in stone, and it can easily be modified to have the turrets turned on at the start of a level. If you want all turrets to be activated by default, open up the 'BP_AutomatedSurveillance_Turret' blueprint and set the 'UseDelayedInitializationModel' variable to False. Basically, this variable determines if an AI agent gets enabled by default, or on a need basis over the course of a mission. On the other hand, if you want only certain turrets placed in the level to be turned on, then just select those actors in the editor and set the aforementioned variable (check the screenshot below) to False through their details panels.
Top Down Stealth Toolkit Tutorial: How to create a new level
1. First, ensure that the default Game Mode & Game Instance class parameters in the Project Settings are set to 'BP_GameMode' & 'BP_GameInstance' classes respectively.
2. Now create a new map, open it, & lay down the floor meshes. Add a Nav Mesh Bounds Volume & extend it to encapsulate all the floor meshes. This will ensure that the AI agents/bots, once added, can traverse the level.
3. Add a Lightmass Importance Volume around the core game space.
4. Now drag & drop the following blueprints as actors into the level: BP_AISensoryManager, BP_AISurveillanceController, BP_GlobalAlertLevelController, BP_PatrolGuardSpawnPoint (multiple, if necessary), & BP_ExitPoint. Before moving on to the next step, here is a brief overview of what each of these actors brings to the toolkit:
The AI Sensory Manager continuously evaluates all stimuli against various agents & dynamically assigns new objectives to the AI agents based on the results. It is basically a task manager for the AI, functioning at a level higher than each of the individual agents.
The AI Surveillance Controller directs the activation of all AI agents within the level. This system can be leveraged to create different starting situations for each level, choosing to activate all security measures by default or have them activated dynamically based on the overall threat perceived by the AI.
The Global Alert Level Controller uses event dispatchers to continuously listen in on new stimuli being perceived by AI agents across the level, & updates the Global Alert Meter based on the threat rating of the perceived stimulus. This meter enables the aforementioned AI Surveillance Controller to dynamically increase the AI presence in the level in order to counter the threat posed by the player.
The Patrol Guard Spawn Points, as the name suggests, act as spawn points to bring in additional Patrol Guards as backup. 
Unlike the other four actors on this list, these spawn points can be added in multiple spots across the level.
The Exit Point essentially serves as a sort of final objective marker for the level & gets activated once the player collects all the gems placed in the level.
Together, these five actors drive the core logic that is essential for the toolkit to function as intended.
5. Now it's time to add the various AI agents & interactive actors into the level. These include collectible Gems, Patrol Guards, Cameras, Motion Sensors, Automated Turrets, Gadget Pickups, etc.
That's all there is to it. You should now have a fully functional level at your disposal. If you have any queries regarding the workflow, feel free to let me know in the comments section. If you're interested in the toolkit, it's now available for purchase through the Unreal Engine Marketplace: https://www.unrealengine.com/marketplace/top-down-stealth-toolkit
Tower Defense Starter Kit Tutorial: How to create a new level
1. First, ensure that the default Game Mode & Game Instance class parameters in the Project Settings are set to BP_GameMode & BP_GameInstance classes respectively.
2. Now create a new map, open it, & lay down the navigable area for AI bots by placing instances of BP_AIPath across the level. Add a Navmesh Bounds Volume & extend it to encapsulate all the AIPath actors.
3. Add a Lightmass Importance Volume around the core game space.
4. The BP_AIPath actors that we added earlier have a secondary function apart from providing a traversal space for the AI bots: they allow the deployment of Global Abilities like Airstrike. Every time a player selects a global ability, a targeting reticule is activated to point to the precise location for deployment. However, if we have empty spaces in the level, this can be an issue as there will be no surfaces to block the GlobalAbilityTargeting trace channel. To rectify this issue, the toolkit provides the BP_PlayerMovementBounds class. While it is an invisible actor, it can block all types of trace channels, & thus provide a platform for displaying the targeting reticule even when there are no meshes under the cursor location.
So the next step is to add a BP_PlayerMovementBounds actor just below the lowest point along the AI path. Now set its 'Trace Blocker Extent' parameter to cover an area larger than the maximum visible game space as seen through the player camera. With that, we have essentially added the ability to use Global Abilities.
Now onto the primary functionality of this actor. As the name suggests, this actor can be used to restrict the movement of the player camera. The movable space for the camera is determined both by the 'Trace Blocker Extent' value as well as the 'Blocker Bounds To Movement Bounds Ratio' attribute [check screenshot below]. 
For example, the default ratio of 3.0 to 1.0 ensures that the player camera can move around in an area corresponding to the middle one-third of the space covered by the actor's trace blocker volume.
5. Since we've already laid down the paths for the AI bots, the next step is to add spawn points along the paths for the enemy waves.
The enemy spawn points are divided into two types based on the navigation model to be used by the AI:
BP_SpawnPoint_NavMeshPathing for AI that uses UE4's navmesh to move towards their objectives. [You can use its 'Designated Objective' parameter to define the objective actor for all AI bots spawned here]
BP_SpawnPoint_SplinePathing for AI that relies on custom spline paths for their movement.
As can be seen in the next screenshot, I've added a couple of spawn points (ignore the old details panel) in my level:
6. Now add a BP_ExitPoint actor somewhere along the path to provide the primary movement target for the enemy AI bots.
7. The toolkit uses modular grid generators to act as platforms for tower placement. But before adding them into the scene, first add a BP_GridManager actor. Use its publicly exposed variable 'Grid Cell Size' to specify the size of the individual grid cells across all grid generators in the level.
8. With the grid cell size specified, it's now time to add the grid generators. Drag & drop instances of BP_GridGenerator into the level. The overall size of the generator in the X & Y (local space) directions can be specified through the 'Grid Count X' & 'Grid Count Y' variables as shown below:
9. Now come two of the most important classes required for the functioning of the toolkit: the Tower Manager & Wave Spawn Controller. Add the BP_TowerManager actor first. There's no need to make any changes to its default parameters as it will work right out of the box with those settings.
Next, add an instance of either BP_BatchedWaveSpawnController or BP_WeightedWaveSpawnController to the level. 
Use the Batched Wave Spawn Controller if you want to have complete control over your AI waves. As the name suggests, it works by batching enemies of a similar type into classes so that you can easily set up your own waves. You can check out the following tutorial to learn how to define your own custom waves:
The Weighted Wave Spawn Controller is a more randomized spawning system with less control over the exact nature of enemy waves. For more information on how it works, you can check out the following article: Weighted Wave Spawn Controller Basics
10. Now onwards to the final step. All that's left to do is to add level bounds so that actors like projectiles that can move across the level get destroyed once they cross a certain threshold. Add six BP_LevelBounds actors along the outer periphery of the level, making sure that they form a cuboid shape to encapsulate the entire level.
With that, you should have a fully functional level at your disposal. If you have any queries regarding the workflow, feel free to let me know in the comments section.
If you're interested in the toolkit, you can purchase it through the Unreal Engine Marketplace: https://www.unrealengine.com/marketplace/tower-defense-starter-kit
Top Down Stealth Toolkit Basics: Global Alert Level System
The following information is based on the v2.1 edition of Top Down Stealth Toolkit & hence may not remain entirely relevant in later versions. For more information about the toolkit, check out the official support thread in the Unreal Engine forums: https://forums.unrealengine.com/unreal-engine/marketplace/68942-top-down-stealth-toolkit The v2.1 update for Top Down Stealth Toolkit introduced a Global Alert Level Controller (GALC) tasked with controlling high-level AI responses to the player's actions. While all AI agents still retain their individual alert level systems, the GALC acts as an autonomous outside listener that keeps track of stimulus perception events, with minimal coupling to existing systems within the toolkit. With every new instance of a stimulus being perceived by an AI agent, the Global Alert Meter keeps rising based on the perceived threat rating of the stimulus. As the meter crosses certain threshold values, the Global Alert Level system kicks into action by activating the designated responses associated with the new state. These could include activation of automated security devices like lasers & turrets, as well as the deployment of reinforcement patrol guards to ramp up the difficulty of the mission over time. The Global Alert Level Controller provides the following options to control the functioning of the new alert level system: 1. ThreatRatingModifier: Controls the rate at which Global Alert Meter goes up. While the perceived threat value of a stimulus is the major deciding factor when it comes to rising global alert levels, this modifier can help control the overall growth curve without having to tune the threat value of individual stimuli. 2. 
GlobalAlertLevelEscalationData: Determines how the game state changes in response to an increase in the Global Alert Level, based on the following parameters:
AlertLevel: The Global Alert Level associated with the state.
AlertMeterTarget: Minimum Global Alert Meter value required to reach the associated level.
AlertSystemResponse: Determines the actions taken by the AI in response to an increase in the Global Alert Level. Every time the Global Alert Level reaches a new state, the 'ResponseType' associated with it gets relayed to the AI Surveillance Controller class, which processes the request & performs the necessary actions. While the default setup offers three types of responses (Idle, ActivateIdleAgents, & Deploy Backup agents), more customized response types can be easily integrated into the system through this parameter.
AlertDisplayBackgroundColor: Determines the background color of the UI element that displays the Global Alert Level.
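As a rough illustration of the meter-and-threshold idea, here is a minimal Python sketch. It is not the toolkit's blueprint implementation: the class layout, the DEFAULT_ESCALATION values, the relay callback, and the exact spelling of the response names are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class EscalationEntry:
    alert_level: int
    alert_meter_target: float  # minimum meter value required to reach this level
    response_type: str


# Hypothetical thresholds; the real values live in the toolkit's data.
DEFAULT_ESCALATION = [
    EscalationEntry(0, 0.0, "Idle"),
    EscalationEntry(1, 50.0, "ActivateIdleAgents"),
    EscalationEntry(2, 100.0, "DeployBackupAgents"),
]


class GlobalAlertLevelController:
    def __init__(self, escalation_data, threat_rating_modifier=1.0, relay=None):
        self.escalation_data = sorted(
            escalation_data, key=lambda entry: entry.alert_meter_target
        )
        self.threat_rating_modifier = threat_rating_modifier
        self.relay = relay  # stand-in for the AI Surveillance Controller
        self.alert_meter = 0.0
        self._index = 0  # assumes the first entry is the starting state

    @property
    def alert_level(self):
        return self.escalation_data[self._index].alert_level

    def on_stimulus_perceived(self, threat_rating):
        """Raise the meter and relay a response for every threshold crossed."""
        self.alert_meter += threat_rating * self.threat_rating_modifier
        triggered = []
        while (self._index + 1 < len(self.escalation_data)
               and self.alert_meter
               >= self.escalation_data[self._index + 1].alert_meter_target):
            self._index += 1
            response = self.escalation_data[self._index].response_type
            triggered.append(response)
            if self.relay is not None:
                self.relay(response)
        return triggered
```

Because the controller only reacts to perception events and hands responses to a callback, it stays loosely coupled to the rest of the AI, in the spirit of the outside-listener design described above.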
Thoughts on Game Design: XP Management Systems for Tower Defense Games
Two years ago, before I first started working on the Tower Defense Starter Kit, I had done some research on various games from the genre to understand the design process behind the underlying gameplay mechanics. Among the lot were popular games like Kingdom Rush & Defense Grid, as well as their lesser-known counterparts like Anomaly Defenders & Sentinel 4. Each of those games had some interesting unique feature that made it stand out, & Sentinel 4 actually had quite a few of them going for it. Of special note among these was the fact that towers leveled up on their own over time. This single design choice added a whole new layer of tactics when it came to tower placement, right from the early stages of a mission.
A couple of months ago, I finally got around to adding an experience-based auto-leveling system to the toolkit. While I thoroughly enjoyed Origin8's (Sentinel 4 Developer) take on tower defense, one aspect that concerned me was the decision to use a last-hit driven XP management system. I've never been a fan of this mechanic, as it kind of negates the impact of all other entities that were crucial to taking down a target. So I ended up experimenting with alternative solutions, hoping to find something that could level the playing field. But before delving further into that topic, I'm going to point out the different options available to us, so as to provide a better idea about the reasoning behind the final decision.
Last-Hit based XP Management System
First, we have a last-hit based XP distribution system, whereby an entity gains experience by landing the killing blow on an enemy. It works especially well in games where the player controls only one character at any point in time, as the designer does not have to worry about sharing XP with other entities. It is also easy to implement & tweak since the experience gain directly depends on the player & enemy levels. 
However, when multiple entities are vying for experience at the same time, distributing XP based on kills could lead to balance issues. To provide an example for such a scenario, let us consider two common types of towers in Tower Defense games: Laser towers & Sniper towers. Laser towers deal continuous damage against a single target & hence excel at clearing out waves of low HP targets with ease. Sniper towers, on the other hand, fire extremely powerful rounds at a single target with a considerably high reload time to boot. This makes them ineffective against fast-moving crowds, but they can take down tougher opponents relatively quickly. Under normal circumstances, both seem to have their own advantages & disadvantages, & thus end up being useful under different types of scenarios. However, upon further inspection, I noticed that the advantages conferred by the system don't always hold up. When you factor in late-game scenarios, especially in endless wave modes, the Laser towers, in spite of dealing a lot of damage, might not get a lot of kills. At the same time, the Sniper towers, owing to their much higher burst damage output, might have a higher chance of taking down targets. Over time, this could make them level up faster, thus increasing their damage output, which ends up getting them more kills, thus rendering them capable of dealing significant damage even as tougher enemies start spawning. This could lead to the Laser towers becoming less useful over time, as their damage output fails to keep up with increasing enemy HP. Of course, there is the option to provide XP to all towers in the vicinity of a fallen enemy, making sure to give an extra bonus to the tower that dealt the killing blow. But I did not want to go down that route unless there was no other choice.
Damage based XP Management System
Next, we have the less ubiquitous damage-driven system, which distributes XP for every instance of damage inflicted by a game entity. 
This ensures that every participant gains experience based on their individual contributions. However, it does come with its own set of problems. For one, it will increase the complexity of the process flow by a slight margin, as projectile-based towers will have to listen to the impact data of every single projectile before getting the XP returns. This extra effort might not really be worth it, since the player may not even notice its passive impact among all the different actions that are required to be performed. The second, & more important, issue is that certain AoE towers like Flamethrower towers might end up dealing huge amounts of damage against crowds, thus enabling them to level up much faster than their counterparts. An additional side effect of this system is the possibility of not getting any XP from shots that miss. For example, Artillery towers fire slower arcing shots that can easily miss their mark, unless the target path has been anticipated before launching the projectile. All of these problems combined would mean the project requires more time dedicated to maintaining game balance. And that leads us to the third system.
Output based XP Management System
After coming to terms with the potential difficulties associated with both the aforementioned systems [when applied within the tower defense game space], I tried breaking down the issue to see if there was a way to come up with something that had the advantages of both while reducing the unnecessary side effects. The solution sprang up in the form of an output-based experience gain system. In this scenario, towers gain experience at the end of every output cycle, irrespective of whether it led to the death of an enemy, or the amount of damage inflicted. In a game where the player is directly in charge of the output generation, this system, if not monitored, could be exploited easily to gain experience without actually engaging in combat. 
However, the towers, being completely free of the player's volition, are not capable of the same & can be trusted to always play a fair game. This new system essentially combines the reduced complexity of the first system with the evenly distributed experience gains of the second. As far as the player is concerned, the tower keeps accruing more XP every time it attacks an enemy, albeit at different rates. Moreover, it provides a high level of predictability in terms of XP growth, as it's entirely reliant on the tower's rate of fire. By comparing the total output generated by each tower in a specified amount of time, the distribution of XP can be easily balanced by the designer. Even so, one disadvantage that pops out when compared to the first system is that the amount of XP gained per cycle does not depend upon the enemy levels. However, this can also be easily taken care of by introducing the wave number/cycle data to act as an XP multiplier.

And thus, after considering all the options, I ended up going for an output driven system for XP growth. While it may not be feasible for most types of games, it seems well-suited for use in the Tower Defense genre.
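To make the output driven idea a bit more concrete, here is a minimal Python sketch rather than the actual implementation used in the game. Every name in it (`Tower`, `cycle_time`, `xp_per_cycle`, `wave_multiplier`) & the linear wave scaling are assumptions made purely for illustration; the text above only says that XP is granted per output cycle & that the wave number can act as a multiplier.

```python
from dataclasses import dataclass


def wave_multiplier(wave_number: int, growth_per_wave: float = 0.1) -> float:
    """Hypothetical linear scaling: the multiplier is 1.0 on wave 1 and grows each wave."""
    return 1.0 + growth_per_wave * (wave_number - 1)


@dataclass
class Tower:
    name: str
    cycle_time: float    # seconds per output cycle (assumed: reload or tick time)
    xp_per_cycle: float  # designer-tuned XP granted at the end of each output cycle
    xp: float = 0.0

    def on_output_cycle(self, wave_number: int) -> None:
        # XP is granted for completing a cycle, regardless of kills or damage dealt.
        self.xp += self.xp_per_cycle * wave_multiplier(wave_number)

    def xp_per_second(self, wave_number: int) -> float:
        # Lets a designer compare XP growth between towers with different fire rates.
        return self.xp_per_cycle * wave_multiplier(wave_number) / self.cycle_time


# Illustrative values only: a fast-ticking Laser and a slow-firing Sniper,
# tuned so that both earn XP at the same rate over time.
laser = Tower("Laser", cycle_time=0.5, xp_per_cycle=1.0)
sniper = Tower("Sniper", cycle_time=4.0, xp_per_cycle=8.0)
```

Because the growth rate depends only on the cycle time & the per-cycle XP, balancing two towers comes down to comparing their `xp_per_second` values for the same wave.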
Top Down Stealth Toolkit Tutorial: How to control the AI Perception model for AI agents
The following information is based on the v2.0 edition of Top Down Stealth Toolkit & hence may not remain entirely relevant in later versions. For more information about the toolkit, check out the official support thread in the Unreal Engine forums: https://forums.unrealengine.com/showthread.php?97156-Top-Down-Stealth-Toolkit

The Top Down Stealth Toolkit uses a custom Perception system to evaluate potential threats for the AI agents. Its user defined properties can all be edited through the component details panel from the owning blueprint, thus facilitating creation of different threat perception models for different types of AI agents. This tutorial will go over the basic parameters that control the working of this system.

Visual Perception Data:
1. IsVisualPerceptionEnabled?: Determines if the agent can use Visual Perception to perceive stimuli.
2. Vision Range: Determines the direct line of sight range.
3. HalfVisionAngle: Determines the half angle for the agent's cone of vision.

Aural Perception Data:
1. IsAuralPerceptionEnabled?: Determines if the agent can use Aural Perception to perceive stimuli.
2. HearingRange: Controls the maximum distance at which a sound of default loudness 1.0 (controlled through the 'PerceptionRangeModifier' parameter of interest stimuli; see Breakdown of Stimulus Parameters) can be perceived by the agent.

Intuitive Perception Data:
1. IsIntuitivePerceptionEnabled?: Determines if the agent can use Intuitive Perception to perceive stimuli.

Motion Perception Data:
1. IsMotionPerceptionEnabled?: Determines if the agent can use Motion Perception to perceive stimuli.
2. DetectionRange: Determines the maximum range at which stimuli can be sensed.
3. DetectionCounter: A counter that decides if the agent should respond to the stimulus [Used because this is an indirect form of perception].
4. CounterIncrementPerCycle: Controls the rate at which the Detection Counter increases if the stimulus is in range.
5. CounterDecrementPerCycle: Controls the rate at which the Detection Counter decreases once the stimulus goes out of range.
6. CriticalDetectionValue: Determines the threshold point for the Detection Counter, at which the agent responds to the stimulus.
7. DisplayAlertnessLevel?: Determines if changes in the Detection Counter value need to be displayed in the game space.

Can Perceive Interests?: Determines if the Agent can perceive Interest Stimuli. If set to false, only Target stimuli will be perceivable.

Note: The Perception system has no awareness about its owning actor & uses a custom interface to receive & send messages. So if you're using a custom AI agent, make sure that the owner class implements the BPI_SensorySystem interface & its functions. For reference, check out the implementation of the interface functions in BP_PatrolGuard_Parent.
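The way the Motion Perception parameters interact can be sketched in a few lines of Python. This is only an illustration of the counter logic described above, not the toolkit's Blueprint implementation: the class & method names are hypothetical, & clamping the counter between 0 & the critical value is an assumption on my part.

```python
from dataclasses import dataclass


@dataclass
class MotionPerception:
    # Field names mirror the parameters listed above; the class itself is hypothetical.
    detection_range: float
    counter_increment_per_cycle: float
    counter_decrement_per_cycle: float
    critical_detection_value: float
    detection_counter: float = 0.0

    def update(self, distance_to_stimulus: float) -> bool:
        """Run one perception cycle and report whether the agent should respond."""
        if distance_to_stimulus <= self.detection_range:
            # Stimulus in range: the counter rises (assumed to cap at the threshold).
            self.detection_counter = min(
                self.detection_counter + self.counter_increment_per_cycle,
                self.critical_detection_value,
            )
        else:
            # Stimulus out of range: the counter decays (assumed to floor at zero).
            self.detection_counter = max(
                self.detection_counter - self.counter_decrement_per_cycle, 0.0
            )
        return self.detection_counter >= self.critical_detection_value
```

With an increment of 25 & a critical value of 100, for instance, a stimulus would have to stay in range for four consecutive cycles before the agent reacts, which is what makes this an indirect form of perception.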
Carl (Director of Technology for Spotlight PA) and Wayne (Principal Engineer at GoDaddy) join Mat and Mark to talk about the new go:embed feature in Go 1.16. They discuss how and when to use it, common gotchas to watch out for, and some rather meaty unpopular opinions thrown in for good measure.
Featuring: Carl Johnson, Wayne Ashley Berry, Mat Ryer, Mark Bates
Notes and Links:
Read the official reference documentation on golang.org
Carl also wrote about How to Use //go:embed
And learn more about its design in this Draft design video with Russ Cox
Here's a pretty useful idea for library authors and their users: there are better ways to test your code!
I give three examples of how user projects can be self-tested without actually writing any real test cases by the end-user. One is hypothetical about django and two examples are real and working: featuring deal and dry-python/returns. A brief example with deal:

    import deal

    @deal.pre(lambda a, b: a >= 0 and b >= 0)
    @deal.raises(ZeroDivisionError)  # this function can raise if `b=0`, it is ok
    def div(a: int, b: int) -> float:
        if a > 50:  # Custom, in real life this would be a bug in our logic:
            raise Exception('Oh no! Bug happened!')
        return a / b

This bug can be automatically found by writing a single line of test code: test_div = deal.cases(div). As easy as it gets! From this article you will learn:
- How to use property-based testing on the next level
- How a simple decorator @deal.pre(lambda a, b: a >= 0 and b >= 0) can help you to generate hundreds of test cases with almost no effort
- What "Monad laws as values" is all about and how dry-python/returns helps its users to build their own monads
I really like this idea! And I would appreciate your feedback on it.
Adam Wathan reveals Tailwind's new JIT compiler:
One of the hardest constraints we've had to deal with as we've improved Tailwind CSS over the years is the generated file size in development. With enough customizations to your config file, the generated CSS can reach 10mb or more, and there's only so much CSS that build tools and even the browser itself will comfortably tolerate.
Today I'm super excited to share a new project we've been working on that makes this constraint a thing of the past: a just-in-time compiler for Tailwind CSS.
ClickHouse has rapidly rivaled other open source databases in active contributors
Lawrence Hecht:
ClickHouse has come out of seemingly nowhere to rival Elasticsearch as the database-related open source software project with the most active contributors. ClickHouse is column-oriented and allows for analytics reports to be generated using SQL queries in real-time. ClickHouse's rise in popularity began in 2016, which happens to be when Apache Spark peaked.
I first heard of ClickHouse last year when I learned that our friends at Plausible use it for their analytics backend (teamed with Postgres for relational data).
O.G. Brian Ketelsen joins the panel to discuss code generation: programs that write programs. They also discuss IDLs, DSLs, overusing language features, generics, and more. Also, Brian plays his guitar.
Featuring: Brian Ketelsen, Mat Ryer, Jon Calhoun, Kris Brandow
Notes and Links:
The panel digs deep on code generation in Go, touching on the new go:embed feature in Go 1.16.
They also discuss IDLs (interface description languages) and DSLs (domain specific languages) and the part they play in code generation.
Brian talks about how we're all guilty of overusing language features, like channels (see Go channels are bad and you should feel bad for an example).
The panel refers to https://litestream.io/ at one point, as an example of a closed-open-source project, and you can read more about it on the Litestream GitHub page.
If you've ever been alarmed by how many security vulnerabilities your Docker image has, even after you've installed security updates, here's what's going on: your image may actually be fine!
Practical AI 125: Deep learning technology for drug discovery
Our Slack community wanted to hear about AI-driven drug discovery, and we listened. Abraham Heifets from Atomwise joins us for a fascinating deep dive into the intersection of deep learning models and molecule binding. He describes how these methods work and how they are beginning to help create drugs for undruggable diseases!
Featuring: Abe Heifets, Chris Benson, Daniel Whitenack
Notes and Links:
Atomwise
Atomwise Receives a $2.3M Grant to Develop New Therapies for Drug Resistant Malaria and Tuberculosis
Atomwise Partners with Global Research Teams to Pursue Broad-Spectrum Treatments Against COVID-19 and Future Coronavirus Outbreaks
World robotic soccer
Philadelphia chromosome
Alphafold
Canavan disease example:
Paper: Discovery of Novel Inhibitors of a Critical Brain Enzyme Using a Homology Model and a Deep Convolutional Neural Network
AtomNet: A Deep Convolutional Neural Network for Bioactivity Prediction in Structure-based Drug Discovery
"Memorizing yesterday's stock price" example: Most Ligand-Based Classification Benchmarks Reward Memorization Rather than Generalization
`whereami` uses WiFi signals & ML to locate you (within 2-10 meters)
If you're adventurous and you want to learn to distinguish between couch #1 and couch #2 (i.e. 2 meters apart), it is the most robust when you switch locations and train in turn. E.g. first in Spot A, then in Spot B then start again with A. Doing this in spot A, then spot B and then immediately using predict will yield spot B as an answer usually. No worries, the effect of this temporal overfitting disappears over time. And, in fact, this is only a real concern for the very short distances. Just take a sample after some time in both locations and it should become very robust.
The linked project was almost entirely copied from the find project, which was written in Go. It then went on to inspire whereami.js. I bet you can guess what that is.
Our journey from a Python monolith to a managed platform
Dropbox Engineering tells the tale of their new SOA:
The majority of software developers at Dropbox contribute to server-side backend code, and all server side development takes place in our server monorepo. We mostly use Python for our server-side product development, with more than 3 million lines of code belonging to our monolithic Python server.
It works, but we realized the monolith was also holding us back as we grew.
This is an excellent, deep re-telling of their goals, decisions, setbacks, and progress. Here's the major takeaway, if you don't have time for a #longread:
The single most important takeaway from this multi-year effort is that well-thought-out code composition, early in a project's lifetime, is essential. Otherwise, technical debt and code complexity compounds very quickly.
In this episode we explore how Clever started using Go. What technologies did Clever start with, how did they transition to Go, and what were the motivations behind those changes? We then explore some of the OS tech written by the team at Clever.
Featuring: Rafael Garcia, Nathan Leiby, Jon Calhoun
Notes and Links:
wag - a tool for generating Go web APIs using a subset of Swagger v2.
sphinx - http rate limiting tool.
leakybucket - leaky bucket implemented in Go.
microplane - CLI used to make git changes across multiple repos.
optimus - a library used to concurrently manipulate collections of data.
reposync
gitbot
Mo Repos, Mo Problems? How We Make Changes Across Many Git Repositories - a writeup by Nathan about how Clever uses the microplane CLI.
Erik Kennedy's Gradient Generator is full-featured and (of course) he teaches you how to design beautiful, butter-smooth gradients on the same page.
While I was trying to identify why my Go-based project took more than three times as long to execute as a similar Bash script (for a code-path that amounted to just a few stderr writes), I found that many of the Go packages (including some in the built-in library) have quite heavy static initializers, which, due to how Go initialization works, are always executed regardless of whether I use them for a particular code-path or not.
Also, with the newly introduced GODEBUG=inittrace=1 in Go 1.16, developers can now investigate the cost of static initializers of their dependencies, thus I wanted to raise awareness of this issue.
"Open source software can potentially increase EU's GDP by over $100 billion"
The OpenForum Europe think tank conducted a study to highlight the potential benefit of embracing open source:
To analyze the impact of open source software in terms of economics, OFE engaged economists who had prior experience illustrating the effect of technology in tangible terms.
Here's how they calculated said benefit:
the economists estimated that in 2018 there were at least 260,000 open source contributors in the EU. Together they produced a volume of code equivalent to the full-time work of 16,000 developers. In terms of economics, these contributions stood between €65 billion ($77.8 billion) and €95 billion ($113.7 billion).
Based on this, the OFE report concluded that even an increase of 10% could potentially increase the EU's GDP by almost €100 billion ($120 billion) per year.
Are these numbers 100% accurate? No. Are they provocative when considering open source impact? I think so.
Go Time's Mat Ryer joins Jerod, KBall, and Nick to play Story of the Week, Today I Learned, Unpopular Opinions, and Shout Outs!
Featuring: Mat Ryer, Jerod Santo, Kevin Ball, Nick Nisi
Notes and Links:
Story of the Week
- Faster JavaScript calls in V8
- Blitz.js
- Blitz on JS Party
- Citibank just got a $500 million lesson in the importance of UI design
TIL
- Embedding SVG filters directly in CSS
- Accessing hardware devices on the web
Shout Outs
- freeCodeCamp's DS Curriculum
- swyx on Twitter
- Vite
Miscellany
- We hear Go Time is pretty good
- Watch our live recording on YouTube
Search inside YouTube videos using natural language
Use OpenAI's CLIP neural network to search inside YouTube videos. You can try it by running the notebook on Google Colab.
The README has a bunch of examples of things you might search for and the results you'd get back. ("The Transamerica Pyramid", anyone?)
The author also has another related project where you can search Unsplash in like manner.
Epidemic - Simple Model of How a Disease Can Spread into a Population
In this post, you will see a very simple model of how a disease can spread into a population. You can get the code, adjust the parameters, and watch different outcomes.
I still remember my excitement when I learned how to build a hover-triggered submenu with just CSS. (It was probably after reading this 2003 article from A List Apart.) At the time, it was a true CSS trick. Seriously. …
The post In Praise of the Unambiguous Click Menu appeared first on CSS-Tricks. You can support CSS-Tricks by being an MVP Supporter.
File this under stuff you don’t need to know just yet, but I think the :has CSS selector is going to have a big impact on how we write CSS in the future. In fact, if it ever ships in …
The post Did You Know About the :has CSS Selector? appeared first on CSS-Tricks.
CSS-Tricks has covered how to break text that overflows its container before, but not as much as you might think. Back in 2012, Chris penned Handling Long Words and URLs (Forcing Breaks, Hyphenation, Ellipsis, etc) and it is still …
The post Better Line Breaks for Long URLs appeared first on CSS-Tricks.
Let's spin up a basic Svelte site and integrate Tailwind into it for styling. One advantage of working with Tailwind is that there isn't any context switching going back and forth between HTML and CSS, since you're applying styles as …
The post How to Use Tailwind on a Svelte Site appeared first on CSS-Tricks.
How I Built my SaaS MVP With Fauna ($150 in revenue so far)
Are you a beginner coder trying to launch your MVP? I’ve just finished my MVP of ReviewBolt.com, a competitor analysis tool. And it’s built using React + Fauna + Next JS. It’s my first paid SaaS tool …
The post How I Built my SaaS MVP With Fauna ($150 in revenue so far) appeared first on CSS-Tricks.
The block editor was a game-changer for WordPress. The idea that we can create blocks of content and arrange them in a component-like fashion means we have a lot of flexibility in how we create content, as well a bunch …
The post The WordPress Evolution Toward Full-Site Editing appeared first on CSS-Tricks.
In the classic 1986 essay, No Silver Bullet, Fred Brooks argued that there is, in some sense, not that much that can be done to improve programmer productivity. His line of reasoning is that programming tasks contain a core of essential/conceptual[1] complexity that's fundamentally not amenable to attack by any potential advances in technology (such as languages or tooling). He then uses an Amdahl's law argument, saying that because 1/X of complexity is essential, it's impossible to ever get more than a factor of X improvement via technological improvements.

Towards the end of the essay, Brooks claims that at least 1/2 (most) of complexity in programming is essential, bounding the potential improvement remaining for all technological programming innovations combined to, at most, a factor of 2[2]:

    All of the technological attacks on the accidents of the software process are fundamentally limited by the productivity equation:

    Time of task = Sum over i { Frequency_i × Time_i }

    If, as I believe, the conceptual components of the task are now taking most of the time, then no amount of activity on the task components that are merely the expression of the concepts can give large productivity gains.

Let's see how this essential complexity claim holds for a couple of things I did recently at work:

- scp from a bunch of hosts to read and download logs, and then parse the logs to understand the scope of a problem
- Query two years of metrics data from every instance of every piece of software my employer has, for some classes of software, and then generate a variety of plots that let me understand some questions I have about what our software is doing and how it's using computer resources

Logs

If we break this task down, we have

- scp logs from a few hundred thousand machines to a local box
  - used a Python script for this to get parallelism with more robust error handling than you'd get out of pssh/parallel-scp
  - ~1 minute to write the script
- do other work while logs download
- parse downloaded logs (a few TB)
  - used a Rust script for this, a few minutes to write (used Rust instead of Python for performance reasons here; just opening the logs and scanning each line with idiomatic Python was already slower than I'd want if I didn't want to farm the task out to multiple machines)

In 1986, perhaps I would have used telnet or ftp instead of scp. Modern scripting languages didn't exist yet (perl was created in 1987 and perl5, the first version that some argue is modern, was released in 1994), so writing code that would do this with parallelism and "good enough" error handling would have taken more than an order of magnitude more time than it takes today. In fact, I think just getting semi-decent error handling while managing a connection pool could have easily taken an order of magnitude longer than this entire task took me (not including time spent downloading logs in the background).

Next up would be parsing the logs. It's not fair to compare an absolute number like "1 TB", so let's just call this "enough that we care about performance" (we'll talk about scale in more detail in the metrics example). Today, we have our choice of high-performance languages where it's easy to write fast, safe code and harness the power of libraries (e.g., a regexp library[3]) that make it easy to write a quick and dirty script to parse and classify logs, farming out the work to all of the cores on my computer (I think Zig would've also made this easy, but I used Rust because my team has a critical mass of Rust programmers).

In 1986, there would have been no comparable language, but more importantly, I wouldn't have been able to trivially find, download, and compile the appropriate libraries and would've had to write all of the parsing code by hand, turning a task that took a few minutes into a task that I'd be lucky to get done in an hour.
Also, if I didn't know how to use the library or that I could use a library, I could easily find out how I should solve the problem on StackOverflow, which would massively reduce accidental complexity. Needless to say, there was no real equivalent to Googling for StackOverflow solutions in 1986.

Moreover, even today, this task, a pretty standard programmer devops/SRE task, after at least an order of magnitude speedup over the analogous task in 1986, is still nearly entirely accidental complexity.

If the data were exported into our metrics stack or if our centralized logging worked a bit differently, the entire task would be trivial. And if neither of those were true, but the log format were more uniform, I wouldn't have had to write any code after getting the logs; rg or ag would have been sufficient. If I look for how much time I spent on the essential conceptual core of the task, it's so small that it's hard to estimate.

Query metrics

We really only need one counter-example, but I think it's illustrative to look at a more complex task to see how Brooks' argument scales for a more involved task.
We can view my metrics querying task as being made up of the following sub-tasks:

- Write a set of Presto SQL queries that effectively scan on the order of 100 TB of data each, from a data set that would be on the order of 100 PB of data if I didn't maintain tables that only contain a subset of data that's relevant
  - Maybe 30 seconds to write the first query and a few minutes for queries to finish, using on the order of 1 CPU-year of CPU time
- Write some ggplot code to plot the various properties that I'm curious about
  - Not sure how long this took; less time than the queries took to complete, so this didn't add to the total time of this task

The first of these tasks is so many orders of magnitude quicker to accomplish today that I'm not even able to hazard a guess as to how much quicker it is within one or two orders of magnitude, but let's break down the first task into component parts to get some idea about the ways in which the task has gotten easier.

It's not fair to port absolute numbers like 100 PB into 1986, but just the idea of having a pipeline that collects and persists comprehensive data analogous to the data I was looking at for a consumer software company (various data on the resource usage and efficiency of our software) would have been considered absurd in 1986. Here we see one fatal flaw in the concept of accidental/essential complexity providing an upper bound on productivity improvements: tasks with too much accidental complexity wouldn't have even been considered possible.
The limit on how much accidental complexity Brooks sees is really a limit of his imagination, not something fundamental.

Brooks explicitly dismisses increased computational power as something that will not improve productivity ("Well, how many MIPS can one use fruitfully?", more on this later), but both storage and CPU power (not to mention network speed and RAM) were sources of accidental complexity so large that they bounded the space of problems Brooks was able to conceive of.

In this example, let's say that we somehow had enough storage to keep the data we want to query in 1986. The next part would be to marshal on the order of 1 CPU-year worth of resources and have the query complete in minutes. As with the storage problem, this would have also been absurd in 1986[4], so we've run into a second piece of non-essential complexity so large that it would stop a person from 1986 from thinking of this problem at all.

Next up would be writing the query. If I were writing for the Cray-2 and wanted to be productive, I probably would have written the queries in Cray's dialect of Fortran 77. Could I do that in less than 300 seconds per query? Not a chance; I couldn't even come close with Scala/Scalding and I think it would be a near thing even with Python/PySpark. This is the aspect where I think we see the smallest gain and we're still well above one order of magnitude here.

After we have the data processed, we have to generate the plots. Even with today's technology, I think not using ggplot would cost me at least 2x in terms of productivity. I've tried every major plotting library that's supposedly equivalent (in any language) and every library I've tried either has multiple show-stopping bugs rendering plots that I consider to be basic in ggplot or is so low-level that I lose more than 2x productivity by being forced to do stuff manually that would be trivial in ggplot. In 2020, the existence of a single library already saves me 2x on this one step.
If we go back to 1986, before the concept of the grammar of graphics and any reasonable implementation, there's no way that I wouldn't lose at least two orders of magnitude of time on plotting even assuming some magical workstation hardware that was capable of doing the plotting operations I do in a reasonable amount of time (my machine is painfully slow at rendering the plots; a Cray-2 would not be able to do the rendering in anything resembling a reasonable timeframe).

The number of orders of magnitude of accidental complexity reduction for this problem from 1986 to today is so large I can't even estimate it and yet this problem still contains such a large fraction of accidental complexity that it's once again difficult to even guess at what fraction of complexity is essential. To write down all of the accidental complexity I can think of would require at least 20k words, but just to provide a bit of the flavor of the complexity, let me write down a few things.

- SQL; this is one of those things that's superficially simple but actually extremely complex
  - Also, Presto SQL
- Arbitrary Presto limits, some of which are from Presto and some of which are from the specific ways we operate Presto and the version we're using
  - There's an internal Presto data structure assert fail that gets triggered when I use both numeric_histogram and cross join unnest in a particular way. Because it's a waste of time to write the bug-exposing query, wait for it to fail, and then re-write it, I have a mental heuristic I use to guess, for any query that uses both constructs, whether or not I'll hit the bug and I apply it to avoid having to write two queries. If the heuristic applies, I'll write a more verbose query that's slower to execute instead of the more straightforward query
  - We partition data by date, but Presto throws this away when I join tables, resulting in very large and therefore expensive joins when I join data across a long period of time even though, in principle, this could be a series of cheap joins; if the join is large enough to cause my query to blow up, I'll write what's essentially a little query compiler to execute day-by-day queries and then post-process the data as necessary instead of writing the naive query
    - There are a bunch of cases where some kind of optimization in the query will make the query feasible without having to break the query across days (e.g., if I want to join host-level metrics data with the table that contains what cluster a host is in, that's a very slow join across years of data, but I also know what kinds of hosts are in which clusters, which, in some cases, lets me filter hosts out of the host-level metrics data that's in there, like core count and total memory, which can make the larger input to this join small enough that the query can succeed without manually partitioning the query)
  - We have a Presto cluster that's "fast" but has "low" memory limits and a cluster that's "slow" but has "high" memory limits, so I mentally estimate how much per-node memory a query will need so that I can schedule it to the right cluster
  - etc.
- When, for performance reasons, I should compute the CDF or histogram in Presto vs. leaving it to the end for ggplot to compute
- How much I need to downsample the data, if at all, for ggplot to be able to handle it, and how that may impact analyses
- Arbitrary ggplot stuff
  - roughly how many points I need to put in a scatterplot before I should stop using size = [number] and should switch to single-pixel plotting because plotting points as circles is too slow
  - what the minimum allowable opacity for points is
  - If I exceed the maximum density where you can see a gradient in a scatterplot due to this limit, how large I need to make the image to reduce the density appropriately (when I would do this instead of using a heatmap deserves its own post)
  - etc.
- All of the above is about tools that I use to write and examine queries, but there's also the mental model of all of the data issues that must be taken into account when writing the query in order to generate a valid result, which includes things like clock skew, Linux accounting bugs, issues with our metrics pipeline, issues with data due to problems in the underlying data sources, etc.
- etc.

For each of Presto and ggplot I implicitly hold over a hundred things in my head to be able to get my queries and plots to work and I choose to use these because these are the lowest overhead tools that I know of that are available to me. If someone asked me to name the percentage of complexity I had to deal with that was essential, I'd say that it was so low that there's no way to even estimate it. For some queries, it's arguably zero: my work was necessary only because of some arbitrary quirk and there would be no work to do without the quirk. 
But even in cases where some kind of query seems necessary, I think it's unbelievable that essential complexity could have been more than 1% of the complexity I had to deal with.

Revisiting Brooks on computer performance, even though I deal with complexity due to the limitations of hardware performance in 2020 and would love to have faster computers today, Brooks wrote off faster hardware as pretty much not improving developer productivity in 1986:

What gains are to be expected for the software art from the certain and rapid increase in the power and memory capacity of the individual workstation? Well, how many MIPS can one use fruitfully? The composition and editing of programs and documents is fully supported by today's speeds. Compiling could stand a boost, but a factor of 10 in machine speed would surely . . .

But this is wrong on at least two levels. First, if I had access to faster computers, a huge amount of my accidental complexity would go away (if computers were powerful enough, I wouldn't need complex tools like Presto; I could just run a query on my local computer). We have much faster computers now, but it's still true that having faster computers would make many involved engineering tasks trivial. As James Hague notes, in the mid-80s, writing a spellchecker was a serious engineering problem due to performance constraints.

Second, (just for example) ggplot only exists because computers are so fast. A common complaint from people who work on performance is that tool X has somewhere between two and ten orders of magnitude of inefficiency when you look at the fundamental operations it does vs. the speed of hardware today [5]. But what fraction of programmers can realize even one half of the potential performance of a modern multi-socket machine? I would guess fewer than one in a thousand and I would say certainly fewer than one in a hundred. 
And performance knowledge isn't independent of other knowledge; controlling for age and experience, it's negatively correlated with knowledge of non-"systems" domains since time spent learning about the esoteric accidental complexity necessary to realize half of the potential of a computer is time spent not learning about "directly" applicable domain knowledge. When we look at software that requires a significant amount of domain knowledge (e.g., ggplot) or that's large enough that it requires a large team to implement (e.g., IntelliJ [6]), the vast majority of it wouldn't exist if machines were orders of magnitude slower and writing usable software required wringing most of the performance out of the machine. Luckily for us, hardware has gotten much faster, allowing the vast majority of developers to ignore performance-related accidental complexity and instead focus on all of the other accidental complexity necessary to be productive today.

Faster computers reduce both the amount of accidental complexity tool users run into and the amount of accidental complexity that tool creators need to deal with, allowing more productive tools to come into existence.

Summary

To summarize, Brooks states a bound on how much programmer productivity can improve. But, in practice, to state this bound correctly, one would have to be able to conceive of problems that no one would reasonably attempt to solve due to the amount of friction involved in solving the problem with current technologies.

Without being able to predict the future, this is impossible to estimate. 
If we knew the future, it might turn out that there's some practical limit on how much computational power or storage programmers can productively use, bounding the resources available to a programmer, but getting a bound on the amount of accidental complexity would still require one to correctly reason about how programmers are going to be able to use zillions of times more resources than are available today, which is so difficult we might as well call it impossible.

Moreover, for each class of tool that could exist, one would have to effectively anticipate all possible innovations. Brooks' strategy for this was to look at existing categories of tools and state, for each, that they would be ineffective or that they were effective but played out. This was wrong not only because it underestimated gains from classes of tools that didn't exist yet, weren't yet effective, or that he wasn't familiar with (e.g., he writes off formal methods, but it doesn't even occur to him to mention fuzzers, static analysis tools that don't fully formally verify code, tools like valgrind, etc.) but also because Brooks thought that every class of tool where there was major improvement was played out and it turns out that none of them were (e.g., programming languages, which Brooks wrote off just before the rise of "scripting languages" as well as just before GC languages took over the vast majority of programming).

In some sense, this isn't too different from when we looked at Unix and found the Unix mavens saying that we should write software like they did in the 70s and that the languages they invented are as safe as any language can be. Since long before computers were invented, elders have been telling the next generation that they've done everything that there is to be done and that the next generation won't be able to achieve more. 
Even without knowing any specifics about programming, we can look at how well these kinds of arguments have held up historically and have decent confidence that the elders are not, in fact, correct this time.

Looking at the specifics with the benefit of hindsight, we can see that Brooks' 1986 claim that we've basically captured all the productivity gains high-level languages can provide isn't too different from an assembly language programmer saying the same thing in 1955, thinking that assembly is as good as any language can be [7], and that his claims about other categories are similar. The main thing these claims demonstrate is a lack of imagination. When Brooks referred to conceptual complexity, he was referring to complexity of using the conceptual building blocks that Brooks was familiar with in 1986 (on problems that Brooks would've thought of as programming problems). There's no reason anyone should think that Brooks' 1986 conception of programming is fundamental any more than they should think that how an assembly programmer from 1955 thought was fundamental. People often make fun of the apocryphal "640k should be enough for anybody" quote, but Brooks saying that, across all categories of potential productivity improvement, we've done most of what's possible to do, is analogous and not apocryphal!

We've seen that, if we look at the future, the fraction of complexity that might be accidental is effectively unbounded. One might argue that, if we look at the present, these terms wouldn't be meaningless. 
But, while this will vary by domain, I've personally never worked on a non-trivial problem that isn't completely dominated by accidental complexity, making the concept of essential complexity meaningless on any problem I've worked on that's worth discussing.

Thanks to Peter Bhat Harkins, Ben Kuhn, Yuri Vishnevsky, Chris Granger, Wesley Aptekar-Cassels, Lifan Zeng, Scott Wolchok, Martin Horenovsky, @realcmb, Kevin Burke, Aaron Brown, and Saul Pwanson for comments/corrections/discussion.

[1] The accidents I discuss in the next section. First let us consider the essence.

The essence of a software entity is a construct of interlocking concepts: data sets, relationships among data items, algorithms, and invocations of functions. This essence is abstract, in that the conceptual construct is the same under many different representations. It is nonetheless highly precise and richly detailed.

I believe the hard part of building software to be the specification, design, and testing of this conceptual construct, not the labor of representing it and testing the fidelity of the representation. We still make syntax errors, to be sure; but they are fuzz compared to the conceptual errors in most systems.

[2] Curiously, he also claims, in the same essay, that no individual improvement can yield a 10x improvement within one decade. While this technically doesn't contradict his Amdahl's law argument plus the claim that "most" (i.e., at least half) of complexity is essential/conceptual, it's unclear why he would include this claim as well.

When Brooks revisited his essay in 1995 in No Silver Bullet Refired, he claimed that he was correct by using the weakest form of the three claims he made in 1986, that within one decade, no single improvement would result in an order of magnitude improvement. 
However, he then re-stated the strongest form of the claim he made in 1986, saying that this time, no set of technological improvements could improve productivity more than 2x, for real:

It is my opinion, and that is all, that the accidental or representational part of the work is now down to about half or less of the total. Since this fraction is a question of fact, its value could in principle be settled by measurement. Failing that, my estimate of it can be corrected by better informed and more current estimates. Significantly, no one who has written publicly or privately has asserted that the accidental part is as large as 9/10.

By the way, I find it interesting that he says that no one disputed this 9/10ths figure. Per the body of this post, I would put it at far above 9/10th for my day-to-day work and, if I were to try to solve the same problems in 1986, the fraction would have been so high that people wouldn't have even conceived of the problem. As a side effect of having worked in hardware for a decade, I've also done work that's not too different from what some people faced in 1986 (microcode, assembly & C written for DOS) and I would put that work as easily above 9/10th as well.

Another part of his follow-up that I find interesting is that he quotes Harel's "Biting the Silver Bullet" from 1992, which, among other things, argues that the decade deadline for an order of magnitude improvement is arbitrary. Brooks' response to this is:

There are other reasons for the decade limit: the claims made for candidate bullets all have had a certain immediacy about them . . . 
We will surely make substantial progress over the next 40 years; an order of magnitude over 40 years is hardly magical.

But by Brooks' own words when he revisits the argument in 1995, if 9/10th of complexity is essential, it would be impossible to get more than an order of magnitude improvement from reducing it, with no caveat on the timespan:

"NSB" argues, indisputably, that if the accidental part of the work is less than 9/10 of the total, shrinking it to zero (which would take magic) will not give an order of magnitude productivity improvement.

Both his original essay and the 1995 follow-up are charismatically written and contain a sort of local logic, where each piece of the essay sounds somewhat reasonable if you don't think about it too hard and you forget everything else the essay says. As with the original, a pedant could argue that this is technically not incoherent; after all, Brooks could be saying:

- at most 9/10th of complexity is accidental (if we ignore the later 1/2 claim, which is the kind of suspension of memory/disbelief one must do to read the essay)
- it would not be surprising for us to eliminate 100% of accidental complexity after 40 years

While this is technically consistent (again, if we ignore the part that's inconsistent) and is a set of claims one could make, this would imply that 40 years from 1986, i.e., in 2026, it wouldn't be implausible for there to be literally zero room for any sort of productivity improvement from tooling, languages, or any other potential source of improvement. But this is absurd. If we look at other sections of Brooks' essay and combine their reasoning, we see other inconsistencies and absurdities.

[3] Another issue that we see here is Brooks' insistence on bright-line distinctions between categories. Essential vs. accidental complexity. "Types" of solutions, such as languages vs. "build vs. buy", etc.

Brooks admits that "build vs. buy" is one avenue of attack on essential complexity. 
Perhaps he would agree that buying a regexp package would reduce the essential complexity since that would allow me to avoid keeping all of the concepts associated with writing a parser in my head for simple tasks. But what if, instead of buying regexes, I used a language where they're bundled into the standard library or are otherwise distributed with the language? Or what if, instead of having to write my own concurrency primitives, those are bundled into the language? Or for that matter, what about an entire HTTP server? There is no bright-line distinction between what's in a library one can "buy" (for free in many cases nowadays) and one that's bundled into the language, so there cannot be a bright-line distinction between what gains a language provides and what gains can be "bought". But if there's no bright-line distinction here, then it's not possible to say that one of these can reduce essential complexity and the other can't and maintain a bright-line distinction between essential and accidental complexity (in a response to Brooks, Harel argued against there being a clear distinction, and Brooks' response was to say that there is, in fact, a bright-line distinction, although he provided no new argument).

Brooks' repeated insistence on these false distinctions means that the reasoning in the essay isn't composable. As we've already seen in another footnote, if you take reasoning from one part of the essay and apply it alongside reasoning from another part of the essay, it's easy to create absurd outcomes and sometimes outright contradictions.

I suspect this is one reason discussions about essential vs. accidental complexity are so muddled. It's not just that Brooks is being vague and handwave-y, he's actually not self-consistent, so there isn't and cannot be a coherent takeaway. 
Michael Feathers has noted that people are generally not able to correctly identify essential complexity; as he says, "One person's essential complexity is another person's accidental complexity." This is exactly what we should expect from the essay, since people who have different parts of it in mind will end up with incompatible views.

This is also a problem when criticizing Brooks. Inevitably, someone will say that what Brooks really meant was something completely different. And that will be true. But Brooks will have meant something completely different while also having meant the things he said that I mention. In defense of the view I'm presenting in the body of the text here, it's a coherent view that one could have had in 1986. Many of Brooks' statements don't make sense even when considered as standalone statements, let alone when cross-referenced with the rest of his essay. For example, take the statement that no single development will result in an order of magnitude improvement in the next decade. This statement is meaningless as Brooks does not define and no one can definitively say what a "single improvement" is. And, as mentioned above, Brooks' essay reads quite oddly and basically does not make sense if that's what he's trying to claim. Another issue with most other readings of Brooks is that those are positions that are also meaningless even if Brooks had done the work to make them well defined. Why does it matter if one single improvement or two result in an order of magnitude improvement? If it's two improvements, we'll use them both.

[4] Let's arbitrarily use a Motorola 68k processor with an FP co-processor that could do 200 kFLOPS as a reference for how much power we might have in a consumer CPU (FLOPS is a bad metric for multiple reasons, but this is just to get an idea of what it would take to get 1 CPU-year of computational resources, and Brooks himself uses MIPS as a term as if it's meaningful). 
By comparison, the Cray-2 could achieve 1.9 GFLOPS, or roughly 10000x the performance (I think actually less if we were to do a comparable comparison instead of using non-comparable GFLOPS numbers, but let's be generous here). There are 525600 / 5 = 105120 five minute periods in a year, so to get 1 CPU-year's worth of computation in five minutes we'd need 105120 / 10000 ≈ 10 Cray-2s per query, not including the overhead of aggregating results across Cray-2s.

It's unreasonable to think that a consumer software company in 1986 would have enough Cray-2s lying around to allow for any random programmer to quickly run CPU-years' worth of queries whenever they wanted to do some data analysis. One source claims that 27 Cray-2s were ever made over the production lifetime of the machine (1985 to 1990). Even if my employer owned all of them and they were all created by 1986, that still wouldn't be sufficient to allow the kind of ad hoc querying capacity that I have access to in 2020.

Today, someone at a startup can even make an analogous argument when comparing to a decade ago. You used to have to operate a cluster that would be prohibitively annoying for a startup to operate unless the startup is very specialized, but you can now just use Snowflake and basically get Presto but only pay for the computational power you use (plus a healthy markup) instead of paying to own a cluster and for all of the employees necessary to make sure the cluster is operable.

[5] I actually run into one of these every time I publish a new post. I write my posts in Google docs and then copy them into emacs running inside tmux running inside Alacritty. My posts are small enough to fit inside L2 cache, so I could have 64B/3.5 cycle write bandwidth. And yet, the copy+paste operation can take ~1 minute and is so slow I can watch the text get pasted in. 
Since my chip is working super hard to make sure the copy+paste happens, it's running at its full non-turbo frequency of 4.2GHz, giving it 76.8GB/s of write bandwidth. For a 40kB post, 1 minute = 666B/s. 76.8G/666 ≈ 10^8, i.e., about 8 orders of magnitude left on the table.

[6] In this specific case, I'm sure somebody will argue that Visual Studio was quite nice in 2000 and ran on much slower computers (and the debugger was arguably better than it is in the current version). But there was no comparable tool on Linux, nor was there anything comparable to today's options in the VSCode-like space of easy-to-learn programming editor that provides programming-specific facilities (as opposed to being a souped up version of notepad) without being a full-fledged IDE.

[7] And by the way, this didn't only happen in 1955. I've worked with people who, this century, told me that assembly is basically as productive as any high level language. This probably sounds ridiculous to almost every reader of this blog, but if you talk to people who spend all day writing microcode or assembly, you'll occasionally meet somebody who believes this.

Thinking that the tools you personally use are as good as it gets is an easy trap to fall into.
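
As a rough sanity check, the back-of-the-envelope numbers in the footnotes above can be reproduced in a few lines of Python. This is only a sketch: the variable and function names are made up for illustration, the inputs are just the figures quoted in the footnotes (200 kFLOPS, 1.9 GFLOPS, 64B per 3.5 cycles at 4.2GHz, a 40kB post pasted in about a minute), and the speedup bound assumes the simple model in which work splits cleanly into an accidental part and an essential part.

```python
import math


def max_speedup(accidental_fraction: float) -> float:
    """Amdahl-style bound: best possible speedup if ALL accidental work vanished.

    Assumes total work = accidental + essential, and only the accidental part
    can be removed (a simplification used purely for illustration).
    """
    return 1.0 / (1.0 - accidental_fraction)


# Footnote on 1986 hardware: how many Cray-2s for 1 CPU-year in five minutes?
m68k_flops = 200e3                      # reference consumer CPU, 200 kFLOPS
cray2_flops = 1.9e9                     # Cray-2, 1.9 GFLOPS
speed_ratio = cray2_flops / m68k_flops  # 9500; the text rounds this to 10000x

five_minute_periods_per_year = 365 * 24 * 60 // 5     # 525600 / 5
cray2s_needed = five_minute_periods_per_year / 10_000  # using the rounded ratio

# Footnote on copy+paste: theoretical vs. observed write bandwidth.
peak_write_bw = (64 / 3.5) * 4.2e9      # bytes/second at 4.2GHz
observed_bw = 40_000 / 60               # 40kB post in ~1 minute, bytes/second
orders_left = math.log10(peak_write_bw / observed_bw)

if __name__ == "__main__":
    print(f"max speedup at 1/2 accidental:  {max_speedup(0.5):.1f}x")
    print(f"max speedup at 9/10 accidental: {max_speedup(0.9):.1f}x")
    print(f"Cray-2 / 68k speed ratio:       {speed_ratio:.0f}x")
    print(f"Cray-2s per five-minute query:  {cray2s_needed:.1f}")
    print(f"peak write bandwidth:           {peak_write_bw / 1e9:.1f} GB/s")
    print(f"orders of magnitude left:       {orders_left:.1f}")
```

The point of writing it out is only that the quoted figures are mutually consistent: the unrounded ratio is a bit under 10000x, so the "10 Cray-2s" estimate is, if anything, slightly generous to 1986.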
How do cars fare in crash tests they're not specifically optimized for?
Any time you have a benchmark that gets taken seriously, some people will start gaming the benchmark. Some famous examples in computing are the CPU benchmark specfp and video game benchmarks. With specfp, Sun managed to increase its score on 179.art (a sub-benchmark of specfp) by 12x with a compiler tweak that essentially re-wrote the benchmark kernel, which increased the Sun UltraSPARC's overall specfp score by 20%. At times, GPU vendors have added specialized benchmark-detecting code to their drivers that lowers image quality during benchmarking to produce higher benchmark scores. Of course, gaming the benchmark isn't unique to computing and we see people do this in other fields. It's not surprising that we see this kind of behavior since improving benchmark scores by cheating on benchmarks is much cheaper (and therefore higher ROI) than improving benchmark scores by actually improving the product.

As a result, I'm generally suspicious when people take highly specific and well-known benchmarks too seriously. Without other data, you don't know what happens when conditions aren't identical to the conditions in the benchmark. With GPU and CPU benchmarks, it's possible for most people to run the standard benchmarks with slightly tweaked conditions. If the results change dramatically for small changes to the conditions, that's evidence that the vendor is, if not cheating, at least shading the truth.

Benchmarks of physical devices can be more difficult to reproduce. Vehicle crash tests are a prime example of this -- they're highly specific and well-known benchmarks that use up a car for some test runs.

While there are multiple organizations that do crash tests, they each have particular protocols that they follow. Car manufacturers, if so inclined, could optimize their cars for crash test scores instead of actual safety. Checking to see if crash tests are being gamed with hyper-specific optimizations isn't really feasible for someone who isn't a billionaire. 
The easiest way we can check is by looking at what happens when new tests are added since that lets us see a crash test result that manufacturers weren't optimizing for just to get a good score.

While having car crash test results is obviously better than not having them, the results themselves don't tell us what happens when we get into an accident that doesn't exactly match a benchmark. Unfortunately, if we get into a car accident, we don't get to ask the driver of the vehicle we're colliding with to change their location, angle of impact, and speed, in order for the collision to comply with an IIHS, NHTSA, or *NCAP test protocol.

For this post, we're going to look at IIHS test scores when they added the (driver side) small overlap and passenger side small overlap tests, which were added in 2012 and 2018, respectively. We'll start with a summary of the results and then discuss what those results mean and other factors to consider when evaluating car safety, followed by details of the methodology.

Results

The ranking below is mainly based on how well vehicles scored when the driver-side small overlap test was added in 2012 and how well models scored when they were modified to improve test results.

Tier 1: good without modifications
- Volvo

Tier 2: mediocre without modifications; good with modifications
- None

Tier 3: poor without modifications; good with modifications
- Mercedes
- BMW

Tier 4: poor without modifications; mediocre with modifications
- Honda
- Toyota
- Subaru
- Chevrolet
- Tesla
- Ford

Tier 5: poor with modifications or modifications not made
- Hyundai
- Dodge
- Nissan
- Jeep
- Volkswagen

These descriptions are approximations. Honda, Ford, and Tesla are the poorest fits for these descriptions, with Ford arguably being halfway in between Tier 4 and Tier 5 but also arguably being better than Tier 4 and not fitting into the classification and Honda and Tesla not really properly fitting into any category (with their category being the closest fit), but some others are also imperfect. 
Details below.

General commentary

If we look at overall mortality in the U.S., there's a pretty large age range for which car accidents are the leading cause of death. Although the numbers will vary depending on what data set we look at, when the driver-side small overlap test was added, the IIHS estimated that 25% of vehicle fatalities came from small overlap crashes. It's also worth noting that small overlap crashes were thought to be implicated in a significant fraction of vehicle fatalities at least since the 90s; this was not a novel concept in 2012.

Despite the importance of small overlap crashes, from looking at the results when the IIHS added the driver-side and passenger-side small overlap tests in 2012 and 2018, it looks like almost all car manufacturers were optimizing for the benchmark and not overall safety. Except for Volvo, all carmakers examined produced cars that fared poorly on driver-side small overlap crashes until the driver-side small overlap test was added.

When the driver-side small overlap test was added in 2012, most manufacturers modified their vehicles to improve driver-side small overlap test scores. However, until the IIHS added a passenger-side small overlap test in 2018, most manufacturers skimped on the passenger side. When the new test was added, they beefed up passenger safety as well. To be fair to car manufacturers, some of them got the hint about small overlap crashes when the driver-side test was added in 2012 and did not need to make further modifications to score well on the passenger-side test, including Mercedes, BMW, and Tesla (and arguably a couple of others, but the data is thinner in the other cases; Volvo didn't need a hint).

Other benchmark limitations

There are a number of other areas where we can observe that most car makers are optimizing for benchmarks at the expense of safety.

Gender, weight, and height

Another issue is crash test dummy overfitting. 
For a long time, adult NHTSA and IIHS tests used a 1970s 50%-ile male dummy, which is 5'9" and 171lbs. Regulators called for a female dummy in 1980 but due to budget cutbacks during the Reagan era, initial plans were shelved and the NHTSA didn't put one in a car until 2003. The female dummy is a scaled down version of the male dummy, scaled down to 5%-ile 1970s height and weight (4'11", 108lbs; another model is 4'11", 97lbs). In frontal crash tests, when a female dummy is used, it's always a passenger (a 5%-ile woman is in the driver's seat in one NHTSA side crash test and the IIHS side crash test). For reference, in 2019, the average weight of a U.S. adult male was 198 lbs and the average weight of a U.S. adult female was 171 lbs.

Using a 1970s U.S. adult male crash test dummy causes a degree of overfitting for 1970s 50%-ile men. For example, starting in the 90s, manufacturers started adding systems to protect against whiplash. Volvo and Toyota use a kind of system that reduces whiplash in men and women and appears to have slightly more benefit for women. Most car makers use a kind of system that reduces whiplash in men but, on average, has little impact on whiplash injuries in women.

It appears that we also see a similar kind of optimization for crashes in general and not just whiplash. We don't have crash test data on this, and looking at real-world safety data is beyond the scope of this post, but I'll note that, until around the time the NHTSA put the 5%-ile female dummy into some crash tests, most car manufacturers not named Volvo had a significant fatality rate differential in side crashes based on gender (with men dying at a lower rate and women dying at a higher rate).

Volvo claims to have been using, for decades, computer models to simulate what would happen if women (including pregnant women) are involved in a car accident.

Other crashes

Volvo is said to have a crash test facility where they do a number of other crash tests that aren't done by testing agencies. 
A reason that they scored well on the small overlap tests when they were added is that they were already doing small overlap crash tests before the IIHS started doing small overlap crash tests.

Volvo also says that they test rollovers (the IIHS tests roof strength and the NHTSA computes how difficult a car is to roll based on properties of the car, but neither tests what happens in a real rollover accident), rear collisions (Volvo claims these are especially important to test if there are children in the 3rd row of a 3-row SUV), and driving off the road (Volvo has a "standard" ditch they use; they claim this test is important because running off the road is implicated in a large fraction of vehicle fatalities).

If other car makers do similar tests, I couldn't find much out about the details. Based on crash test scores, it seems like they weren't doing or even considering small overlap crash tests before 2012. Based on how many car makers had poor scores when the passenger side small overlap test was added in 2018, I think it would be surprising if other car makers had a large suite of crash tests they ran that aren't being run by testing agencies, but it's theoretically possible that they do and just didn't include a passenger side small overlap test.

Caveats

We shouldn't overgeneralize from these test results. As we noted above, crash test results test very specific conditions. As a result, what we can conclude when a couple new crash tests are added is also very specific. Additionally, there are a number of other things we should keep in mind when interpreting these results.

Limited sample size

One limitation of this data is that we don't have results for a large number of copies of the same model, so we're unable to observe intra-model variation, which could occur due to minor, effectively random, differences in test conditions as well as manufacturing variations between different copies of the same model. 
We can observe that these do matter since some cars will see different results when two copies of the same model are tested. For example, here's a quote from the IIHS report on the Dodge Dart:

The Dodge Dart was introduced in the 2013 model year. Two tests of the Dart were conducted because electrical power to the onboard (car interior) cameras was interrupted during the first test. In the second Dart test, the driver door opened when the hinges tore away from the door frame. In the first test, the hinges were severely damaged and the lower one tore away, but the door stayed shut. In each test, the Dart's safety belt and front and side curtain airbags appeared to adequately protect the dummy's head and upper body, and measures from the dummy showed little risk of head and chest injuries.

It looks like, had electrical power to the interior car cameras not been disconnected, there would have been only one test and it wouldn't have become known that there's a risk of the door coming off due to the hinges tearing away. In general, we have no direct information on what would happen if another copy of the same model were tested.

Using IIHS data alone, one thing we might do here is to also consider results from different models made by the same manufacturer (or built on the same platform). Although this isn't as good as having multiple tests for the same model, test results between different models from the same manufacturer are correlated and knowing that, for example, a 2nd test of a model that happened by chance showed significantly worse results should probably reduce our confidence in other test scores from the same manufacturer.
There are some things that complicate this, e.g., if looking at Toyota, the Yaris is actually a re-branded Mazda2, so perhaps that shouldn't be considered as part of a pooled test result, and doing this kind of statistical analysis is beyond the scope of this post.

Actual vehicle tested may be different

Although I don't think this should impact the results in this post, another issue to consider when looking at crash test results is how results are shared between models. As we just saw, different copies of the same model can have different results. Vehicles that are somewhat similar are often considered the same for crash test purposes and will share the same score (only one of the models will be tested).

For example, this is true of the Kia Stinger and the Genesis G70. The Kia Stinger is 6" longer than the G70 and a fully loaded AWD Stinger is about 500 lbs heavier than a base-model G70. The G70 is the model that IIHS tested -- if you look up a Kia Stinger, you'll get scores for a Stinger with a note that a base model G70 was tested. That's a pretty big difference considering that cars that are nominally identical (such as the Dodge Darts mentioned above) can get different scores.

Quality may change over time

We should also be careful not to overgeneralize temporally. If we look at crash test scores of recent Volvos (vehicles on the Volvo P3 and Volvo SPA platforms), crash test scores are outstanding. However, if we look at Volvo models based on the older Ford C1 platform[1], crash test scores for some of these aren't as good (in particular, while the S40 doesn't score poorly, it scores Acceptable in some categories instead of Good across the board). Although Volvo has had stellar crash test scores recently, this doesn't mean that they have always had or will always have stellar crash test scores.

Models may vary across markets

We also can't generalize across cars sold in different markets, even for vehicles that sound like they might be identical.
For example, see this crash test of a Nissan NP300 manufactured for sale in Europe vs. a Nissan NP300 manufactured for sale in Africa. Since European cars undergo EuroNCAP testing (similar to how U.S. cars undergo NHTSA and IIHS testing), vehicles sold in Europe are optimized to score well on EuroNCAP tests. Crash testing cars sold in Africa has only been done relatively recently, so car manufacturers haven't had PR pressure to optimize their cars for benchmarks and they'll produce cheaper models or cheaper variants of what superficially appear to be the same model. This appears to be no different from what most car manufacturers do in the U.S. or Europe -- they're optimizing for cost as long as they can do that without scoring poorly on benchmarks. It's just that, since there wasn't an African crash test benchmark, that meant they could go all-in on the cost side of the cost-safety tradeoff[2].

This report compared U.S. and European car models and found differences in safety due to differences in regulations. They found that European models had lower injury risk in frontal/side crashes and that driver-side mirrors were designed in a way that reduced the risk of lane-change crashes relative to U.S. designs and that U.S. vehicles were safer in rollovers and had headlamps that made pedestrians more visible.

Non-crash tests

Over time, more and more of the "low hanging fruit" from crash safety has been picked, making crash avoidance relatively more important. Tests of crash mitigation are relatively primitive compared to crash tests and we've seen that crash tests had and have major holes.
One might expect, based on what we've seen with crash tests, that Volvo has a particularly good set of tests they use for their crash avoidance technology (traction control, stability control, automatic braking, etc.), but I don't know of any direct evidence for that.

Crash avoidance becoming more important might also favor Tesla, since they seem more aggressive about pushing software updates (so people wouldn't have to buy a newer model to get improved crash avoidance) and it's plausible that they use real-world data from their systems to inform crash avoidance in a way that most car companies don't, but I also don't know of any direct evidence of this.

Scores of vehicles of different weights aren't comparable

A 2700lb subcompact vehicle that scores Good may fare worse than a 5000lb SUV that scores Acceptable. This is because the small overlap tests involve driving the vehicle into a fixed obstacle, as opposed to a reference vehicle or vehicle-like obstacle of a specific weight. This is, in some sense, equivalent to crashing the vehicle into a vehicle of the same weight, so it's as if the 2700lb subcompact was tested by running it into a 2700lb subcompact and the 5000lb SUV was tested by running it into another 5000lb SUV.

How to increase confidence

We've discussed some reasons we should reduce our confidence in crash test scores. If we wanted to increase our confidence in results, we could look at test results from other test agencies and aggregate them and also look at public crash fatality data (more on this later). I haven't looked at the terms and conditions of scores from other agencies, but one complication is that the IIHS does not allow you to display the result of any kind of aggregation if you use their API or data dumps (I, time consumingly, did not use their API for this post because of that).

Using real life crash data

Public crash fatality data is complex and deserves its own post.
In this post, I'll note that, if you look at the easiest relevant data for people in the U.S., this data does not show that Volvos are particularly safe (or unsafe). For example, if we look at this report from 2017, which covers models from 2014, two Volvo models made it into the report and both score roughly middle of the pack for their class. In the previous report, one Volvo model is included and it's among the best in its class; in the next, one Volvo model is included and it's among the worst in its class. We can observe this kind of variance for other models, as well. For example, among 2014 models, the Volkswagen Golf had one of the highest fatality rates for all vehicles (not just in its class). But among 2017 vehicles, it had among the lowest fatality rates for all vehicles. It's unclear how much of that change is from random variation and how much is because of differences between a 2014 and 2017 Volkswagen Golf.

Overall, it seems like noise is a pretty important factor in results. And if we look at the information that's provided, we can see a few things that are odd. First, there are a number of vehicles where the 95% confidence interval for the fatality rate runs from 0 to N. We should have pretty strong priors that there was no 2014 model vehicle that was so safe that the probability of being killed in a car accident was zero.
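To make the "0 to N" oddness concrete, here's a minimal sketch of how such an interval arises when zero deaths are observed, and how a prior built from a class-wide base rate moves the lower bound off zero. This is my own illustration, not the method used in the report: the exposure, the class base rate, the prior strength, and the use of a Gamma prior with a Poisson likelihood are all assumptions chosen for the example.

```python
import math
import random

def zero_death_interval(exposure, alpha=0.05):
    """Exact Poisson interval for a rate when zero events were observed.

    The lower bound is 0; the upper bound solves
    exp(-rate * exposure) = alpha / 2.
    """
    return 0.0, -math.log(alpha / 2) / exposure

def gamma_posterior_interval(deaths, exposure, prior_shape, prior_rate,
                             alpha=0.05, draws=100_000, seed=0):
    """Monte Carlo credible interval for a Poisson rate under a Gamma prior.

    Gamma(shape, rate) is conjugate to the Poisson likelihood, so the
    posterior is Gamma(shape + deaths, rate + exposure).
    """
    rng = random.Random(seed)
    shape = prior_shape + deaths
    scale = 1.0 / (prior_rate + exposure)  # gammavariate takes a scale, not a rate
    samples = sorted(rng.gammavariate(shape, scale) for _ in range(draws))
    lo = samples[int(draws * alpha / 2)]
    hi = samples[int(draws * (1 - alpha / 2)) - 1]
    return lo, hi

# Hypothetical numbers: 0 driver deaths over 0.1 million vehicle years,
# in a class whose base rate is 30 deaths per million vehicle years.
exposure = 0.1
class_rate = 30.0
prior_shape = 3.0                      # assumed prior strength
prior_rate = prior_shape / class_rate  # makes the prior mean equal the class rate

freq_lo, freq_hi = zero_death_interval(exposure)
bayes_lo, bayes_hi = gamma_posterior_interval(0, exposure, prior_shape, prior_rate)
print(f"no prior:    {freq_lo:.1f} to {freq_hi:.1f}")
print(f"class prior: {bayes_lo:.1f} to {bayes_hi:.1f}")
```

With zero observed deaths, the first interval always starts at exactly zero no matter how little exposure there is, while the lower bound of the second is strictly positive.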
If we were taking a Bayesian approach (though I believe the authors of the report are not), and someone told us that the uncertainty interval for the true fatality rate of a vehicle had a >= 5% chance of including zero, we would say that either we should use a more informative prior or we should use a model that can incorporate more data (in this case, perhaps we could try to understand the variance between fatality rates of different models in the same class and then use the base rate of fatalities for the class as a prior, or we could incorporate information from other models under the same make if those are believed to be correlated).

Some people object to using informative priors as a form of bias laundering, but we should note that the prior that's used for the IIHS analysis is not completely uninformative. All of the intervals reported stop at zero because they're using the fact that a vehicle cannot create life to bound the interval at zero. But we have information that's nearly as strong that no 2014 vehicle is so safe that the expected fatality rate is zero; using that information is not fundamentally different from capping the interval at zero and not reporting negative numbers for the uncertainty interval of the fatality rate.

Also, the IIHS data only includes driver fatalities. This is understandable since that's the easiest way to normalize for the number of passengers in the car, but it means that we can't possibly see the impact of car makers not improving passenger small-overlap safety until the passenger-side small overlap test was added in 2018, the result of lack of rear crash testing for the case Volvo considers important (kids in the 3rd row of a 3-row SUV), etc.

We can also observe that, in the IIHS analysis, many factors that one might want to control for aren't (e.g., miles driven isn't controlled for, which will make trucks look relatively worse and luxury vehicles look relatively better, rural vs.
urban miles driven also isn't controlled for, which will also have the same directional impact). One way to see that the numbers are heavily influenced by confounding factors is by looking at AWD or 4WD vs. 2WD versions of cars. They often have wildly different fatality rates even though the safety differences are not very large (and the difference is often in favor of the 2WD vehicle). Some plausible causes of that are random noise, differences in who buys different versions of the same vehicle, and differences in how the vehicles are used.

If we'd like to answer the question "which car makes or models are more or less safe", I don't find any of the aggregations that are publicly available to be satisfying and I think we need to look at the source data and do our own analysis to see if the data are consistent with what we see in crash test results.

Conclusion

We looked at 12 different car makes and how they fared when the IIHS added small overlap tests. We saw that only Volvo was taking this kind of accident seriously before companies were publicly shamed for having poor small overlap safety by the IIHS even though small overlap crashes were known to be a significant source of fatalities at least since the 90s.

Although I don't have the budget to do other tests, such as a rear crash test in a fully occupied vehicle, it appears plausible and perhaps even likely that most car makers that aren't Volvo would have mediocre or poor test scores if a testing agency decided to add another kind of crash test.

Bonus: "real engineering" vs. programming

As Hillel Wayne has noted, although programmers often have an idealized view of what "real engineers" do, when you compare what "real engineers" do with what programmers do, it's frequently not all that different.
In particular, a common lament of programmers is that we're not held liable for our mistakes or poor designs, even in cases where that costs lives.

Although automotive companies can, in some cases, be held liable for unsafe designs, just optimizing for a small set of benchmarks, which must've resulted in extra deaths over optimizing for safety instead of benchmark scores, isn't something that engineers or corporations were, in general, held liable for.

Bonus: reputation

If I look at what people in my extended social circles think about vehicle safety, Tesla has the best reputation by far. If you look at broad-based consumer polls, that's a different story, and Volvo usually wins there, with other manufacturers fighting for a distant second.

I find the Tesla thing interesting since their responses are basically the opposite of what you'd expect from a company that was serious about safety. When serious problems have occurred (with respect to safety or otherwise), they often have a very quick response that's basically "everything is fine". I would expect an organization that's serious about safety or improvement to respond with "we're investigating", followed by a detailed postmortem explaining what went wrong, but that doesn't appear to be Tesla's style.

For example, on the driver-side small overlap test, Tesla had one model with a relevant score and it scored Acceptable (below Good, but above Poor and Marginal) even after modifications were made to improve the score. Tesla disputed the results, saying they make "the safest cars in history" and implying that IIHS should be ignored in favor of NHTSA test scores:

While IIHS and dozens of other private industry groups around the world have methods and motivations that suit their own subjective purposes, the most objective and accurate independent testing of vehicle safety is currently done by the U.S.
Government which found Model S and Model X to be the two cars with the lowest probability of injury of any cars that it has ever tested, making them the safest cars in history.

As we've seen, Tesla isn't unusual for optimizing for a specific set of crash tests and achieving a mediocre score when an unexpected type of crash occurs, but their response is unusual. However, it makes sense from a cynical PR perspective. As we've seen over the past few years, loudly proclaiming something, regardless of whether or not it's true, even when there's incontrovertible evidence that it's untrue, seems to not only work, but that kind of bombastic rhetoric also appears to attract superfans who will aggressively defend the brand. If you watch car reviewers on YouTube, they'll sometimes mention that they get hate mail for reviewing Teslas just like they review any other car and that they don't see anything like it for any other make.

Apple also used this playbook to good effect in the 90s and early '00s, when they were rapidly falling behind in performance and responded not by improving performance, but by running a series of ad campaigns saying that they had the best performance in the world and that they were shipping "supercomputers" on the desktop.

Another reputational quirk is that I know a decent number of people who believe that the safest cars they can buy are "American Cars from the 60's and 70's that aren't made of plastic". We don't have directly relevant small overlap crash test scores for old cars, but the test data we do have on old cars indicates that they fare extremely poorly in overall safety compared to modern cars. For a visually dramatic example, see this crash test of a 1959 Chevrolet Bel Air vs.
a 2009 Chevrolet Malibu.

Appendix: methodology summary

The top-line results section uses scores for the small overlap test both because it's the one where I think it's the most difficult to justify skimping on safety as measured by the test and because it's been around for long enough that we can see the impact of modifications to existing models and changes to subsequent models, which isn't true of the passenger side small overlap test (where many models are still untested).

For the passenger side small overlap test, someone might argue that the driver side is more important because you virtually always have a driver in a car accident and may or may not have a front passenger. Also, for small overlap collisions (which simulate a head-to-head collision where the vehicles only overlap by 25%), driver's side collisions are more likely than passenger side collisions.

Except to check Volvo's scores, I didn't look at roof crash test scores (which were added in 2009). I'm not going to describe the roof test in detail, but for the roof test, someone might argue that the roof test score should be used in conjunction with scoring the car for rollover probability since the roof test just tests roof strength, which is only relevant when a car has rolled over. I think, given what the data show, this objection doesn't hold in many cases (the vehicles with the worst roof test scores are often vehicles that have relatively high rollover rates), but it does in some cases, which would complicate the analysis.

In most cases, we only get one reported test result for a model. However, there can be multiple versions of a model -- including before and after making safety changes intended to improve the test score. If changes were made to the model to improve safety, the test score is usually from after the changes were made and we usually don't get to see the score from before the model was changed.
However, there are many exceptions to this, which are noted in the detailed results section.

For this post, scores only count if the model was introduced before or near when the new test was introduced, since models introduced later could have design changes that optimize for the test.

Appendix: detailed results

On each test, IIHS gives an overall rating (from worst to best) of Poor, Marginal, Acceptable, or Good. The tests have sub-scores, but we're not going to use those for this analysis. In each sub-section, we'll look at how many models got each score when the small overlap tests were added.

Volvo

All Volvo models examined scored Good (the highest possible score) on the new tests when they were added (roof, driver-side small overlap, and passenger-side small overlap). One model, the 2008-2017 XC60, had a change made to trigger its side curtain airbag during a small overlap collision in 2013. Other models were tested without modifications.

Mercedes

Of three pre-existing models with test results for driver-side small overlap, one scored Marginal without modifications and two scored Good after structural modifications. The model where we only have unmodified test scores (Mercedes C-Class) was fully re-designed after 2014, shortly after the driver-side small overlap test was introduced.

As mentioned above, we often only get to see public results for models without modifications to improve results xor with modifications to improve results, so, for the models that scored Good, we don't actually know how they would've scored if you bought a vehicle before Mercedes updated the design, but the Marginal score from the one unmodified model we have is a negative signal.

Also, when the passenger side small overlap test was added, the Mercedes vehicles also generally scored Good.
This indicates that Mercedes didn't only increase protection on the driver's side in order to improve test scores.

BMW

Of the two models where we have relevant test scores, both scored Marginal before modifications. In one of the cases, there's also a score after structural changes were made in the 2017 model (recall that the driver-side small overlap test was introduced in 2012) and the model scored Good afterwards. The other model was fully redesigned after 2016.

For the five models where we have relevant passenger-side small overlap scores, all scored Good, indicating that the changes made to improve driver-side small overlap test scores weren't only made on the driver's side.

Honda

Of the five Honda models where we have relevant driver-side small overlap test scores, two scored Good, one scored Marginal, and two scored Poor. The model that scored Marginal had structural changes plus a seatbelt change in 2015 that changed its score to Good; other models weren't updated or don't have updated IIHS scores.

Of the six Honda models where we have passenger-side small overlap test scores, two scored Good without modifications, two scored Acceptable without modifications, and one scored Good with modifications to the bumper.

All of those models scored Good on the driver side small overlap test, indicating that when Honda increased the safety on the driver's side to score Good on the driver's side test, they didn't apply the same changes to the passenger side.

Toyota

Of the six Toyota models where we have relevant driver-side small overlap test scores for unmodified models, one scored Acceptable, four scored Marginal, and one scored Poor.

The model that scored Acceptable had structural changes made to improve its score to Good, but on the driver's side only. The model was later tested in the passenger-side small overlap test and scored Acceptable.
Of the four models that scored Marginal, one had structural modifications made in 2017 that improved its score to Good and another had airbag and seatbelt changes that improved its score to Acceptable. The vehicle that scored Poor had structural changes made that improved its score to Acceptable in 2014, followed by later changes that improved its score to Good.

There are four additional models where we only have scores from after modifications were made. Of those, one scored Good, one scored Acceptable, one scored Marginal, and one scored Poor.

In general, changes appear to have been made to the driver's side only and, on introduction of the passenger side small overlap test, vehicles had passenger side small overlap scores that were the same as the driver's side score before modifications.

Ford

Of the two models with relevant driver-side small overlap test scores for unmodified models, one scored Marginal and one scored Poor. Both of those models were produced into 2019 and neither has an updated test result. Of the three models where we have relevant results for modified vehicles, two scored Acceptable and one scored Marginal. Also, one model was released the year the small overlap test was introduced and one the year after; both of those scored Acceptable. It's unclear if those should be considered modified or not since the design may have had last-minute changes before release.

We only have three relevant passenger-side small overlap tests. One is Good (for a model released in 2015) and the other two are Poor; these are the two models mentioned above as having scored Marginal and Poor, respectively, on the driver-side small overlap test. It appears that the models continued to be produced into 2019 without safety changes.
Both of these unmodified models were trucks and this isn't very unusual for a truck and is one of a number of reasons that fatality rates are generally higher in trucks -- until recently, many of them were based on old platforms that hadn't been updated for a long time.

Chevrolet

Of the three Chevrolet models where we have relevant driver-side small overlap test scores before modifications, one scored Acceptable and two scored Marginal. One of the Marginal models had structural changes plus a change that caused side curtain airbags to deploy sooner in 2015, which improved its score to Good.

Of the four Chevrolet models where we only have relevant driver-side small overlap test scores after the model was modified (all had structural modifications), two scored Good and two scored Acceptable.

We only have one relevant score for the passenger-side small overlap test; that score is Marginal. That's on the model that was modified to improve its driver-side small overlap test score from Marginal to Good, indicating that the changes were made to improve the driver-side test score and not to improve passenger safety.

Subaru

We don't have any models where we have relevant passenger-side small overlap test scores for models before they were modified.

One model had a change to cause its airbag to deploy during small overlap tests; it scored Acceptable. Two models had some kind of structural changes, one of which scored Good and one of which scored Acceptable.

The model that had airbag changes had structural changes made in 2015 that improved its score from Acceptable to Good.

For the one model where we have relevant passenger-side small overlap test scores, the score was Marginal.
Also, for one of the models with structural changes, it was indicated that, among the changes, were changes to the left part of the firewall, indicating that changes were made to improve the driver's side test score without improving safety for a passenger on a passenger-side small overlap crash.

Tesla

There's only one model with relevant results for the driver-side small overlap test. That model scored Acceptable before and after modifications were made to improve test scores.

Hyundai

Of the five vehicles where we have relevant driver-side small overlap test scores, one scored Acceptable, three scored Marginal, and one scored Poor. We don't have any indication that models were modified to improve their test scores.

Of the two vehicles where we have relevant passenger-side small overlap test scores for unmodified models, one scored Good and one scored Acceptable.

We also have one score for a model that had structural modifications to score Acceptable, which later had further modifications that allowed it to score Good. That model was introduced in 2017 and had a Good score on the driver-side small overlap test without modifications, indicating that it was designed to achieve a good test score on the driver's side test without similar consideration for a passenger-side impact.

Dodge

Of the five models where we have relevant driver-side small overlap test scores for unmodified models, two scored Acceptable, one scored Marginal, and two scored Poor.
There are also two models where we have test scores after structural changes were made for safety in 2015; both of those models scored Marginal.

We don't have relevant passenger-side small overlap test scores for any model, but even if we did, the dismal scores on the modified models mean that we might not be able to tell if similar changes were made to the passenger side.

Nissan

Of the seven models where we have relevant driver-side small overlap test scores for unmodified models, two scored Acceptable and five scored Poor.

We have one model that only has test scores for a modified model; the frontal airbags and seatbelts were modified in 2013 and the side curtain airbags were modified in 2017. The score after modifications was Marginal.

One of the models that scored Poor had structural changes made in 2015 that improved its score to Good.

Of the four models where we have relevant passenger-side small overlap test scores, two scored Good, one scored Acceptable (that model scored Good on the driver-side test), and one scored Marginal (that model also scored Marginal on the driver-side test).

Jeep

Of the two models where we have relevant driver-side small overlap test scores for unmodified models, one scored Marginal and one scored Poor.

There's one model where we only have test scores after modifications; that model had changes to its airbags and seatbelts and it scored Marginal after the changes. This model was also later tested on the passenger-side small overlap test and scored Poor.

One other model has a relevant passenger-side small overlap test score; it scored Good.

Volkswagen

The two models where we have relevant driver-side small overlap test scores for unmodified models both scored Marginal.

Of the two models where we only have scores after modifications, one was modified in 2013 and scored Marginal after modifications. It was then modified again in 2015 and scored Good after modifications.
That model was later tested on the passenger-side small overlap test, where it scored Acceptable, indicating that the modifications differentially favored the driver's side. The other scored Acceptable after changes made in 2015 and then scored Good after further changes made in 2016. The 2016 model was later tested on the passenger-side small overlap test and scored Marginal, once again indicating that changes differentially favored the driver's side.

We have passenger-side small overlap test scores for two other models, both of which scored Acceptable. These were models introduced in 2015 (well after the introduction of the driver-side small overlap test) and scored Good on the driver-side small overlap test.

Appendix: miscellanea

A number of name brand car makes weren't included. Some because their sales in the U.S. are low and/or declining rapidly (Mitsubishi, Fiat, Alfa Romeo, etc.), some because there's very high overlap in what vehicles are tested (Kia, Mazda, Audi), and some because there aren't relevant models with driver-side small overlap test scores (Lexus).
When a corporation owns an umbrella of makes, like FCA with Jeep, Dodge, Chrysler, Ram, etc., these weren't pooled since most people who aren't car nerds aren't going to recognize FCA, but may recognize Jeep, Dodge, and Chrysler.

If the terms of service of the API allowed you to use IIHS data however you wanted, I would've included smaller makes, but since the API comes with very restrictive terms on how you can display or discuss the data which aren't compatible with exploratory data analysis and I couldn't know how I would want to display or discuss the data before looking at the data, I pulled all of these results by hand (and didn't click through any EULAs, etc.), which was fairly time consuming, so there was a trade-off between more comprehensive coverage and the rest of my life.

Appendix: what car should I buy?

That depends on what you're looking for; there's no way to make a blanket recommendation. For practical information about particular vehicles, Alex on Autos is the best source that I know of. I don't generally like videos as a source of practical information, but car magazines tend to be much less informative than YouTube car reviewers. There are car reviewers that are much more popular, but their popularity appears to come from having witty banter between charismatic co-hosts or other things that not only aren't directly related to providing information, they actually detract from providing information. If you just want to know about how cars work, Engineering Explained is also quite good, but the information there is generally not practical.

For reliability information, Consumer Reports is probably your best bet (you can also look at J.D.
Power, but the way they aggregate information makes it much less useful to consumers).

Thanks to Leah Hanson, Travis Downs, Prabin Paudel, and Justin Blank for comments/corrections/discussion.

[1] This includes the 2004-2012 Volvo S40/V50, 2006-2013 Volvo C70, and 2007-2013 Volvo C30, which were designed during the period when Ford owned Volvo. Although the C1 platform was a joint venture between Ford, Volvo, and Mazda engineers, the work was done under a Ford VP at a Ford facility.

[2] To be fair, as we saw with the IIHS small overlap tests, not every manufacturer did terribly. In 2017 and 2018, 8 vehicles sold in Africa were crash tested. One got what we would consider a mediocre to bad score in the U.S. or Europe, five got what we would consider to be a bad score, and "only" three got what we would consider to be an atrocious score. The Nissan NP300, Datsun Go, and Chery QQ3 were the three vehicles that scored the worst. Datsun is a sub-brand of Nissan and Chery is a Chinese brand, also known as Qirui.

We see the same thing if we look at cars sold in India. Recently, some tests have been run on cars sent to the Indian market and a number of vehicles from Datsun, Renault, Chevrolet, Tata, Honda, Hyundai, Suzuki, Mahindra, and Volkswagen came in with atrocious scores that would be considered impossibly bad in the U.S. or Europe.
A lot of people seem to think that distributed tracing isn't useful, or at least not without extreme effort that isn't worth it for companies smaller than FB. For example, here are a couple of public conversations that sound like a number of private conversations I've had. Sure, there's value somewhere, but it costs too much to unlock.

I think this overestimates how much work it is to get a lot of value from tracing. At Twitter, Rebecca Isaacs was able to lay out a vision for how to get value from tracing and executed on it (with help from a number of other folks, including Jonathan Simms, Yuri Vishnevsky, Ruben Oanta, Dave Rusek, Hamdi Allam, and many others [1]) such that the work easily paid for itself. This post is going to describe the tracing "infrastructure" we've built and describe some use cases where we've found it to be valuable. Before we get to that, let's start with some background about the situation before Rebecca's vision came to fruition.

At a high level, we could say that we had a trace-view oriented system and ran into all of the issues that one might expect from that. Those issues are discussed in more detail in this article by Cindy Sridharan. However, I'd like to discuss the particular issues we had in more detail since I think it's useful to look at what specific things were causing problems.

Taken together, the issues were problematic enough that tracing was underowned and arguably unowned for years.
Some individuals did work in their spare time to keep the lights on or improve things, but the lack of obvious value from tracing led to a vicious cycle where the high barrier to getting value out of tracing made it hard to fund organizationally, which made it hard to make tracing more usable.

Some of the issues that made tracing low ROI included:

- Schema made it impossible to run simple queries "in place"
- No real way to aggregate info
- No way to find interesting or representative traces
- Impossible to know actual sampling rate, sampling highly non-representative
- Time

Schema

The schema was effectively a set of traces, where each trace was a set of spans and each span was a set of annotations. Each span that wasn't a root span had a pointer to its parent, so that the graph structure of a trace could be determined.

For the purposes of this post, we can think of each trace as either an external request including all sub-RPCs or a subset of a request, rooted downstream instead of at the top of the request.
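To make the cost of this schema concrete, here is a minimal Python sketch of what "reconstructing the graph" involves when all you have is a bag of spans with parent pointers. The `Span` class and its field names (`span_id`, `parent_id`, `service`, `annotations`) are hypothetical stand-ins for illustration, not the actual schema. The point is that even a simple structural question, like the depth of a trace, requires touching every span.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    # Hypothetical field names; the real schema is not shown in the post.
    span_id: str
    parent_id: Optional[str]  # None for a root span
    service: str
    annotations: dict = field(default_factory=dict)


def build_children(spans):
    """Rebuild the call graph of one trace from parent pointers.

    Returns (roots, children). A span whose parent is missing from the
    trace (e.g., a dropped span) is treated as a root of its own subtree.
    """
    ids = {s.span_id for s in spans}
    children = defaultdict(list)
    roots = []
    for s in spans:  # every span must be read, there is no index
        if s.parent_id is None or s.parent_id not in ids:
            roots.append(s.span_id)
        else:
            children[s.parent_id].append(s.span_id)
    return roots, dict(children)


def trace_depth(roots, children):
    """Depth of the deepest call chain, computed iteratively."""
    deepest = 0
    stack = [(root, 1) for root in roots]
    while stack:
        node, depth = stack.pop()
        deepest = max(deepest, depth)
        stack.extend((child, depth + 1) for child in children.get(node, []))
    return deepest


trace = [
    Span("a", None, "A"),
    Span("b", "a", "B"),
    Span("c", "b", "C"),
    Span("d", "a", "D"),
]
roots, children = build_children(trace)
depth = trace_depth(roots, children)
```

The traversal is iterative rather than recursive because very deep traces would otherwise risk hitting Python's recursion limit.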
We also trace some things that aren't requests, like builds and git operations, but for simplicity we're going to ignore those for this post even though the techniques we'll discuss also apply to those.

Each span corresponds to an RPC and each annotation is data that a developer chose to record on a span (e.g., the size of the RPC payload, queue depth of various queues in the system at the time of the span, or GC pause time for GC pauses that interrupted the RPC).

Some issues that came out of having a schema that was a set of sets (of bags) included:

- Executing any query that used information about the graph structure inherent in a trace required reading every span in the trace and reconstructing the graph
- Because there was no index or summary information of per-trace information, any query on a trace required reading every span in a trace
- Practically speaking, because the two items above are too expensive to do at query time in an ad hoc fashion, the only query people ran was some variant of "give me a few spans matching a simple filter"

Aggregation

Until about a year and a half ago, the only supported way to look at traces was to go to the UI, filter by a service name from a combination search box + dropdown, and then look at a list of recent traces, where you could click on any trace to get a "trace view". Each search returned the N most recent results, which wouldn't necessarily be representative of all recent results (for reasons mentioned below in the Sampling section), let alone representative of all results over any other time span.

Per the problems discussed above in the schema section, since it was too expensive to run queries across a non-trivial number of traces, it was impossible to ask questions like "are any of the traces I'm looking at representative of common traces or am I looking at weird edge cases?" or "show me traces of specific tail events, e.g., when a request from service A to service B times out or when write amplification from service A to some backing database is > 3x", or even "only show me complete traces, i.e., traces where we haven't dropped spans from the trace".

Also, if you clicked on a trace that was "too large", the query would time out and you wouldn't be able to view the trace -- this was another common side effect of the lack of any kind of rate limiting logic plus the schema.

Sampling

There were multiple places where a decision was made to sample or not. There was no document that listed all of these places, making it impossible to even guess at the sampling rate without auditing all code to figure out where sampling decisions were being made.

Moreover, there were multiple places where an unintentional sampling decision would be made due to the implementation. Spans were sent from services that had tracing enabled to a local agent, then to a "collector" service, and then from the collector service to our backing DB. Spans could be dropped at any of these points: in the local agent; in the collector, which would have nodes fall over and lose all of their data regularly; and at the backing DB, which would reject writes due to hot keys or high load in general.

This design where the trace id is the database key, with no intervening logic to pace out writes, meant that a 1M span trace (which we have) would cause 1M writes to the same key over a period of a few seconds. Another problem would be requests with a fanout of thousands (which exists at every tech company I've worked for), which could cause thousands of writes with the same key over a period of a few milliseconds.

Another sampling quirk was that, in order to avoid missing traces that didn't start at our internal front end, there was logic that caused an independent sampling decision in every RPC.
If you do the math on this, if you have a service-oriented architecture like ours and you sample at what naively might sound like a moderately low rate, like, you'll end up with the vast majority of your spans starting at a leaf RPC, resulting in a single span trace. Of the non-leaf RPCs, the vast majority will start at the 2nd level from the leaf, and so on. The vast majority of our load and our storage costs were from these virtually useless traces that started at or near a leaf, and if you wanted to do any kind of analysis across spans to understand the behavior of the entire system, you'd have to account for this sampling bias on top of accounting for all of the other independent sampling decisions.

Time

There wasn't really any kind of adjustment for clock skew (there was something, but it attempted to do a local pairwise adjustment, which didn't really improve things and actually made it more difficult to reasonably account for clock skew).

If you just naively computed how long a span took, even using timestamps from a single host, which removes many sources of possible clock skew, you'd get a lot of negative duration spans, which is of course impossible because a result can't get returned before the request for the result is created. And if you compared times across different hosts, the results were even worse.

Solutions

The solutions to these problems fall into what I think of as two buckets. For problems like dropped spans due to collector nodes falling over or the backing DB dropping requests, there's some straightforward engineering solution using well understood and widely used techniques. For that particular pair of problems, the short term bandaid was to do some GC tuning that reduced the rate of collector nodes falling over by about a factor of 100. That took all of two minutes, and then we replaced the collector nodes with a real queue that could absorb larger bursts in traffic and pace out writes to the DB.
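The "do the math" argument about per-RPC sampling can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual sampling code: it assumes a uniform RPC tree, an RPC only rolls the dice if no ancestor was already sampled, and the fanout, depth, and rate used below are made-up numbers. The second function sketches the alternative of making the decision only for spans that have no parent.

```python
import random


def trace_root_share_by_level(fanout, depth, rate):
    """Expected share of traces rooted at each level of a uniform RPC tree.

    Level 0 is the front-end request and level `depth` holds the leaf RPCs.
    An RPC at a given level starts a new trace only if none of its ancestors
    was sampled, which happens with probability (1 - rate) ** level.
    """
    expected = [
        (fanout ** level) * ((1 - rate) ** level) * rate
        for level in range(depth + 1)
    ]
    total = sum(expected)
    return [count / total for count in expected]


def should_start_trace(parent_span_id, rate, rng=random):
    """Root-only sampling: roll the dice only for spans without a parent."""
    if parent_span_id is not None:
        return False  # never start a new trace below an existing parent
    return rng.random() < rate


# Hypothetical numbers: fanout of 10, three levels below the root, 0.1% rate.
shares = trace_root_share_by_level(fanout=10, depth=3, rate=0.001)
leaf_share = shares[-1]
```

With these made-up parameters, roughly nine out of ten traces are rooted at a leaf RPC, since there are so many more leaves than roots. With root-only sampling, every sampled trace starts at the top of a request.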
For the issue where we oversampled leaf-level spans due to rolling the sampling dice on every RPC, that's one of these little questions that most people would get right in an interview that can sometimes get lost as part of a larger system. It has a number of solutions, e.g., since each span has a parent pointer, we must be able to know if an RPC has a parent or not in a relevant place, and we can make a sampling decision and create a traceid iff a span has no parent pointer, which results in a uniform probability of each span being sampled, with each sampled trace being a complete trace.

The other bucket is building up datasets and tools (and adding annotations) that allow users to answer questions they might have. This isn't a new idea; section 5 of the Dapper paper discussed this and it was published in 2010.

Of course, one major difference is that Google has probably put at least two orders of magnitude more effort into building tools on top of Dapper than we've put into building tools on top of our tracing infra, so a lot of our tooling is much rougher, e.g., figure 6 from the Dapper paper shows a trace view that displays a set of relevant histograms, which makes it easy to understand the context of a trace. We haven't done the UI work for that yet, so the analogous view requires running a simple SQL query.
While that's not hard, presenting the user with the data would be a better user experience than making the user query for the data.

Of the work that's been done, the simplest obviously high ROI thing we've done is build a set of tables that contain information people might want to query, structured such that common queries that don't inherently have to do a lot of work don't have to do a lot of work.

We have, partitioned by day, the following tables:

- trace_index: high-level trace-level information, e.g., does the trace have a root; what is the root; if relevant, what request endpoint was hit, etc.
- span_index: information on the client and server
- anno_index: "standard" annotations that people often want to query, e.g., request and response payload sizes, client/server send/recv timestamps, etc.
- span_metrics: computed metrics, e.g., span durations
- flat_annotation: all annotations, in case you want to query something not in anno_index
- trace_graph: for each trace, contains a graph representation of the trace, for use with queries that need the graph structure

Just having this set of tables, queryable with SQL queries (or a Scalding or Spark job in cases where Presto SQL isn't ideal, like when doing some graph queries) is enough for tracing to pay for itself, to go from being difficult to justify to being something that's obviously high value.

Some of the questions we've been able to answer with this set of tables include:

- For this service that's having problems, give me a representative set of traces
- For this service that has elevated load, show me which upstream service is causing the load
- Give me the list of all services that have unusual write amplification to downstream service X
- Is traffic from a particular service or for a particular endpoint causing unusual write amplification? For example, in some cases, we see nothing unusual about the total write amplification from B -> C, but we see very high amplification from B -> C when B is called by A.
- Show me how much time we spend on serdes vs. "actual work" for various requests
- Show me how much different kinds of requests cost in terms of backend work
- For requests that have high latency, as determined by mobile client instrumentation, show me what happened on the backend
- Show me the set of latency critical paths for this request endpoint (with the annotations we currently have, this has a number of issues that probably deserve their own post)
- Show me the CDF of services that this service depends on
  - This is a distribution because whether or not a particular service calls another service is data dependent; it's not uncommon to have a service that will only call another one every 1000 calls (on average)

We have built and are building other tooling, but just being able to run queries and aggregations against trace data, both recent and historical, easily pays for all of the other work we'd like to do. This is analogous to what we saw when we looked at metrics data: taking data we already had and exposing it in a way that lets people run arbitrary queries immediately paid dividends. Doing that for tracing is less straightforward than doing that for metrics because the data is richer, but it's not a fundamentally different idea.

I think that having something to look at other than the raw data is also more important for tracing than it is for metrics since the metrics equivalent of a raw "trace view" of traces, a "dashboard view" of metrics where you just look at graphs, is obviously and intuitively useful. If that's all you have for metrics, people aren't going to say that it's not worth funding your metrics infra because dashboards are really useful! However, it's a lot harder to see how to get value out of a raw view of traces, which is where a lot of the comments about tracing not being valuable come from.
This difference between the complexity of metrics data and tracing data makes the value add for higher-level views of tracing larger than it is for metrics.

Having our data in a format that's not just blobs in a NoSQL DB has also allowed us to more easily build tooling on top of trace data that lets users who don't want to run SQL queries get value out of our trace data. An example of this is the Service Dependency Explorer (SDE), which was primarily built by Yuri Vishnevsky, Rebecca Isaacs, and Jonathan Simms, with help from Yihong Chen. If we try to look at the RPC call graph for a single request, we get something that's pretty large. In some cases, the call tree can be hundreds of levels deep and it's also not uncommon to see a fanout of 20 or more at some levels, which makes a naive visualization difficult to interpret.

In order to see how SDE works, let's look at a smaller example where it's relatively easy to understand what's going on. Imagine we have 8 services, A through H, and they call each other as shown in the tree below. We have service A called 10 times, which calls service B a total of 10 times, which calls D, D, and E 50, 20, and 10 times respectively, where the two Ds are distinguished by being different RPC endpoints (calls) even though they're the same service, and so on, shown below:

If we look at SDE from the standpoint of node E, we'll see the following:

We can see the direct callers and callees: 100% of calls of E are from C, and 100% of calls of E also call C, and we have 20x load amplification when calling C (200/10 = 20), the same as we see if we look at the RPC tree above.
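The load amplification figure is just a ratio of call counts: how many RPCs a downstream service receives per call to the service we're looking at (200 / 10 = 20 in the example). As a rough Python sketch of how such a number could be derived from span data, assuming each span is a hypothetical `(span_id, parent_id, service)` tuple rather than the real table layout:

```python
from collections import Counter


def downstream_amplification(spans, target):
    """Calls to each downstream service per call to `target`.

    `spans` is a list of (span_id, parent_id, service) tuples for any number
    of traces. A span counts as downstream of `target` if any of its
    ancestors is a span of the `target` service.
    """
    parent = {span_id: parent_id for span_id, parent_id, _ in spans}
    service = {span_id: svc for span_id, _, svc in spans}
    target_calls = sum(1 for svc in service.values() if svc == target)
    if target_calls == 0:
        return {}

    downstream = Counter()
    for span_id in service:
        ancestor = parent.get(span_id)
        # Walk up the parent pointers; stop at a root or a dropped span.
        while ancestor is not None and ancestor in service:
            if service[ancestor] == target:
                downstream[service[span_id]] += 1
                break
            ancestor = parent[ancestor]
    return {svc: count / target_calls for svc, count in downstream.items()}


# Tiny made-up trace: one call to E fans out to two calls to C,
# each of which makes one call to D.
spans = [
    ("e1", None, "E"),
    ("c1", "e1", "C"),
    ("c2", "e1", "C"),
    ("d1", "c1", "D"),
    ("d2", "c2", "D"),
]
amplification = downstream_amplification(spans, "E")
```

Walking parent pointers per span is quadratic in the worst case for very deep traces; a precomputed graph table like the trace_graph table mentioned earlier would avoid repeating that work at query time.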
If we look at indirect callees, we can see that D has a 4x load amplification (40 / 10 = 4).

If we want to see what's directly called by C downstream of E, we can select it and we'll get arrows to the direct descendants of C, which in this case is every indirect callee of E.

For a more complicated example, we can look at service D, which shows up in orange in our original tree, above.

In this case, our summary box reads:

On May 28, 2020 there were...
- 10 total TFE-rooted traces
- 110 total traced RPCs to D
- 2.1 thousand total traced RPCs caused by D
- 3 unique call paths from TFE endpoints to D endpoints

The fact that we see D three times in the tree is indicated in the summary box, where it says we have 3 unique call paths from our front end, TFE, to D.

We can expand out the calls to D and, in this case, see both of the calls and what fraction of traffic is to each call.

If we click on one of the calls, we can see which nodes are upstream and downstream dependencies of a particular call. call4 is shown below and we can see that it never hits services C, H, and G downstream even though service D does for call3.
Similarly, we can see that its upstream dependencies consist of being called directly by C, and indirectly by B and E but not A and C:

Some things we can easily see from SDE are:

- What load a service or RPC call causes
  - Where we have unusual load amplification, whether that's generally true for a service or if it only occurs on some call paths
- What causes load to a service or RPC call
- Where and why we get cycles (very common for Strato, among other things)
- What's causing weird super deep traces

These are all things a user could get out of queries to the data we store, but having a tool with a UI that lets you click around in real time to explore things lowers the barrier to finding these things out.

In the example shown above, there are a small number of services, so you could get similar information out of the more commonly used sea of nodes view, where each node is a service, with some annotations on the visualization, but when we've looked at real traces, showing thousands of services and a global view makes it very difficult to see what's going on. Some of Rebecca's early analyses used a view like that, but we've found that you need to have a lot of implicit knowledge to make good use of a view like that; a view that discards a lot more information and highlights a few things makes it easier for users who don't happen to have the right implicit knowledge to get value out of looking at traces.

Although we've demo'd a view of RPC count / load here, we could also display other things, like latency, errors, payload sizes, etc.

Conclusion

More generally, this is just a brief description of a few of the things we've built on top of the data you get if you have basic distributed tracing set up. You probably don't want to do exactly what we've done since you probably have somewhat different problems and you're very unlikely to encounter the exact set of problems that our tracing infra had.
From backchannel chatter with folks at other companies, I don't think the level of problems we had was unique; if anything, our tracing infra was in a better state than at many or most peer companies (which excludes behemoths like FB/Google/Amazon) since it basically worked and people could and did use the trace view we had to debug real production issues. But, as they say, unhappy systems are unhappy in their own way.

Like our previous look at metrics analytics, this work was done incrementally. Since trace data is much richer than metrics data, a lot more time was spent doing ad hoc analyses of the data before writing the Scalding (MapReduce) jobs that produce the tables mentioned in this post, but the individual analyses were valuable enough that there wasn't really a time when this set of projects didn't pay for itself after the first few weeks it took to clean up some of the worst data quality issues and run an (extremely painful) ad hoc analysis with the existing infra.

Looking back at discussions on whether or not it makes sense to work on tracing infra, people often point to the numerous failures at various companies to justify a buy (instead of build) decision. I don't think that's exactly unreasonable; the base rate of failure of similar projects shouldn't be ignored. But, on the other hand, most of the work described wasn't super tricky, beyond getting organizational buy-in and having a clear picture of the value that tracing can bring.

One thing that's a bit beyond the scope of this post that probably deserves its own post is that tracing and metrics, while not fully orthogonal, are complementary, and having only one or the other leaves you blind to a lot of problems. You're going to pay a high cost for that in a variety of ways: unnecessary incidents, extra time spent debugging incidents, generally higher monetary costs due to running infra inefficiently, etc.
Also, while having both metrics and tracing gives you much better visibility than having either alone, some problems require looking at both together; some of the most interesting analyses I've done involve joining (often with a literal SQL join) trace data and metrics data.

To make it concrete, an example of something that's easy to see with tracing but annoying to see with logging unless you add logging to try to find this in particular (which you can do for any individual case, but probably don't want to do for the thousands of things tracing makes visible), is something we looked at above: "show me cases where a specific call path from the load balancer to A causes high load amplification on some service B, which may be multiple hops away from A in the call graph". In some cases, this will be apparent because A generally causes high load amplification on B, but if it only happens in some cases, that's still easy to handle with tracing but it's very annoying if you're just looking at metrics.

An example of something where you want to join tracing and metrics data is when looking at the performance impact of something like a bad host on latency. You will, in general, not be able to annotate the appropriate spans that pass through the host as bad because, if you knew the host was bad at the time of the span, the host wouldn't be in production. But you can sometimes find, with historical data, a set of hosts that are bad, and then look up latency critical paths that pass through the host to determine the end-to-end impact of the bad host.

Everyone has their own biases. With respect to tracing, mine come from generally working on things that try to directly improve cost, reliability, and latency, so the examples are focused on that, but there are also a lot of other uses for tracing.
You can check out Distributed Tracing in Practice or Mastering Distributed Tracing for some other perspectives.

Acknowledgements

Thanks to Rebecca Isaacs, Leah Hanson, Yao Yue, and Yuri Vishnevsky for comments/corrections/discussion.

[1] This will almost certainly be an incomplete list, but some other people who've pitched in include Moses, Tiina, Rich, Rahul, Ben, Mike, Mary, Arash, Feng, Jenny, Andy, Yao, Yihong, Vinu, and myself.

Note that this relatively long list of contributors doesn't contradict this work being high ROI. I'd estimate that there's been less than 2 person-years worth of work on everything discussed in this post. Just for example, while I spend a fair amount of time doing analyses that use the tracing infra, I think I've only spent on the order of one week on the infra itself.

In case it's not obvious from the above, even though I'm writing this up, I was a pretty minor contributor to this. I'm just writing it up because I sat next to Rebecca as this work was being done and was super impressed by both her process and the outcome.
We spent one day [1] building a system that immediately found a mid 7 figure optimization (which ended up shipping). In the first year, we shipped mid 8 figures per year worth of cost savings as a result. The key feature this system introduces is the ability to query metrics data across all hosts and all services and over any period of time (since inception), so we've called it LongTermMetrics (LTM) internally since I like boring, descriptive names.

This got started when I was looking for a starter project that would both help me understand the Twitter infra stack and also have some easily quantifiable value. Andy Wilcox suggested looking at JVM survivor space utilization for some large services. If you're not familiar with what survivor space is, you can think of it as a configurable, fixed-size buffer in the JVM (at least if you use the GC algorithm that's default at Twitter). At the time, if you looked at a random large service, you'd usually find that either:

- The buffer was too small, resulting in poor performance, sometimes catastrophically poor when under high load.
- The buffer was too large, resulting in wasted memory, i.e., wasted money.

But instead of looking at random services, there's no fundamental reason that we shouldn't be able to query all services and get a list of which services have room for improvement in their configuration, sorted by performance degradation or cost savings. And if we write that query for JVM survivor space, this also goes for other configuration parameters (e.g., other JVM parameters, CPU quota, memory quota, etc.). Writing a query that worked for all the services turned out to be a little more difficult than I was hoping due to a combination of data consistency and performance issues.
Data consistency issues included things like:

- Any given metric can have ~100 names, e.g., I found 94 different names for JVM survivor space
  - I suspect there are more, these were just the ones I could find via a simple search
- The same metric name might have a different meaning for different services
  - Could be a counter or a gauge
  - Could have different units, e.g., bytes vs. MB or microseconds vs. milliseconds
- Metrics are sometimes tagged with an incorrect service name
- Zombie shards can continue to operate and report metrics even though the cluster manager has started up a new instance of the shard, resulting in duplicate and inconsistent metrics for a particular shard name

Our metrics database, MetricsDB, was specialized to handle monitoring, dashboards, alerts, etc. and didn't support general queries. That's totally reasonable, since monitoring and dashboards are lower on Maslow's hierarchy of observability needs than general metrics analytics. In backchannel discussions with folks at other companies, the entire set of systems around MetricsDB seems to have solved a lot of the problems that plague people at other companies with similar scale, but the specialization meant that we couldn't run arbitrary SQL queries against metrics in MetricsDB.

Another way to query the data is to use the copy that gets written to HDFS in Parquet format, which allows people to run arbitrary SQL queries (as well as write Scalding (MapReduce) jobs that consume the data).

Unfortunately, due to the number of metric names, the data on HDFS can't be stored in a columnar format with one column per name -- Presto gets unhappy if you feed it too many columns and we have enough different metrics that we're well beyond that limit. If you don't use a columnar format (and don't apply any other tricks), you end up reading a lot of data for any non-trivial query. The result was that you couldn't run any non-trivial query (or even many trivial queries) across all services or all hosts without having it time out.
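To illustrate the naming and unit inconsistencies listed above, here is a minimal Python sketch of a normalization step that maps many raw metric names onto one canonical name and one unit. Everything in it is an assumption made for illustration: the alias strings are invented, the unit table treats MB as 1024 * 1024 bytes, and this is not how the actual pipeline is implemented. Only the canonical name `jvmSurvivorUsed` is taken from the query shown later in the post.

```python
# Hypothetical alias table: many raw names map to one canonical column.
ALIASES = {
    "jvm/mem/survivor/used": "jvmSurvivorUsed",
    "jvm.survivor.used_bytes": "jvmSurvivorUsed",
    "hypothetical.survivor.used_mb": "jvmSurvivorUsed",
}

# Multipliers that convert a reported unit into bytes (assumed convention).
BYTES_PER_UNIT = {"bytes": 1, "KB": 1024, "MB": 1024 * 1024}


def normalize(raw_name, value, unit):
    """Return (canonical_name, value_in_bytes), or None for unknown metrics.

    Dropping unknown names mirrors the idea of keeping only the small
    fraction of metrics that matter for performance and capacity queries.
    """
    canonical = ALIASES.get(raw_name)
    if canonical is None:
        return None
    return canonical, value * BYTES_PER_UNIT[unit]


sample = normalize("hypothetical.survivor.used_mb", 2, "MB")
```

A table-driven mapping like this keeps the messy part (the alias list) as data that can grow as more names are discovered, without touching the logic.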
We don't have similar timeouts for Scalding, but Scalding performance is much worse and a simple Scalding query against a day's worth of metrics will usually take between three and twenty hours, depending on cluster load, making it unreasonable to use Scalding for any kind of exploratory data analysis.

Given the data infrastructure that already existed, an easy way to solve both of these problems was to write a Scalding job to store the 0.1% to 0.01% of metrics data that we care about for performance or capacity related queries and re-write it into a columnar format. I would guess that at least 90% of metrics are things that almost no one will want to look at in almost any circumstance, and of the metrics anyone really cares about, the vast majority aren't performance related. A happy side effect of this is that since such a small fraction of the data is relevant, it's cheap to store it indefinitely. The standard metrics data dump is deleted after a few weeks because it's large enough that it would be prohibitively expensive to store it indefinitely; a longer metrics memory will be useful for capacity planning or other analyses that prefer to have historical data.

The data we're saving includes (but isn't limited to) the following things for each shard of each service:

- utilizations and sizes of various buffers
- CPU, memory, and other utilization
- number of threads, context switches, core migrations
- various queue depths and network stats
- JVM version, feature flags, etc.
- GC stats
- Finagle metrics

And for each host:

- various things from procfs, like iowait time, idle, etc.
- what cluster the machine is a part of
- host-level info like NIC speed, number of cores on the host, memory
- host-level stats for "health" issues like thermal throttling, machine checks, etc.
- OS version, host-level software versions, host-level feature flags, etc.
- Rezolus metrics

For things that we know change very infrequently (like host NIC speed), we store these daily, but most of these are stored at the same frequency and granularity as our other metrics. In some cases, this is obviously wasteful (e.g., for JVM tenuring threshold, which is typically identical across every shard of a service and rarely changes), but this was the easiest way to handle this given the infra we have around metrics.

Although the impetus for this project was figuring out which services were under or over configured for JVM survivor space, it started with GC and container metrics since those were very obvious things to look at and we've been incrementally adding other metrics since then. To get an idea of the kinds of things we can query for and how simple queries are if you know a bit of SQL, here are some examples:

Very High p90 JVM Survivor Space

This is part of the original goal of finding under/over-provisioned services. Any service with a very high p90 JVM survivor space utilization is probably under-provisioned on survivor space. Similarly, anything with a very low p99 or p999 JVM survivor space utilization when under peak load is probably overprovisioned (query not displayed here, but we can scope the query to times of high load).

A Presto query for very high p90 survivor space across all services is:

    with results as (
      select
        servicename,
        approx_distinct(source, 0.1) as approx_sources, -- number of shards for the service
        -- real query uses coalesce and nullif (https://prestodb.io/docs/current/functions/conditional.html)
        -- to handle edge cases, omitted for brevity
        approx_percentile(jvmSurvivorUsed / jvmSurvivorMax, 0.90) as p90_used,
        approx_percentile(jvmSurvivorUsed / jvmSurvivorMax, 0.50) as p50_used
      from ltm_service
      where ds >= '2020-02-01' and ds <= '2020-02-28'
      group by servicename
    )
    select * from results
    where approx_sources > 100
    order by p90_used desc

Rather than having to look through a bunch of dashboards, we can just get a list and then send diffs with config changes to the appropriate teams or write a script that takes the output of the query and automatically writes the diff.
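For readers who think more easily in code than in SQL, the same ranking logic can be sketched in plain Python. This is an illustrative equivalent under assumptions, not part of LTM: rows are hypothetical `(service, shard, used, max)` tuples, the percentile is an exact nearest-rank percentile rather than Presto's approximate one, and rows with a zero max are skipped, which stands in for the coalesce/nullif edge-case handling the query omits.

```python
import math
from collections import defaultdict


def percentile(values, q):
    """Exact nearest-rank percentile of a non-empty list, 0 < q <= 1."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def rank_services_by_p90(rows, min_shards):
    """Rank services by p90 survivor space utilization, highest first.

    `rows` holds (service, shard, used, max) samples. Services with
    `min_shards` or fewer distinct shards are dropped, mirroring the
    `approx_sources > 100` filter in the query.
    """
    ratios = defaultdict(list)
    shards = defaultdict(set)
    for service, shard, used, maximum in rows:
        if not maximum:
            continue  # avoid dividing by zero for unreported or zero max
        ratios[service].append(used / maximum)
        shards[service].add(shard)
    ranked = [
        (service, percentile(values, 0.90))
        for service, values in ratios.items()
        if len(shards[service]) > min_shards
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Made-up samples: service "a" has three shards, "b" has one, "c" has bad data.
rows = [
    ("a", "a-0", 20, 100),
    ("a", "a-1", 50, 100),
    ("a", "a-2", 99, 100),
    ("b", "b-0", 10, 100),
    ("c", "c-0", 5, 0),
]
ranked = rank_services_by_p90(rows, min_shards=2)
```

At real data volumes this would not be done in a single Python process; the sketch is only meant to make the grouping, filtering, and ordering steps explicit.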
The above query provides a pattern for any basic utilization numbers or rates; you could look at memory usage, new or old gen GC frequency, etc., with similar queries. In one case, we found a service that was wasting enough RAM to pay my salary for a decade.

I've been moving away from using thresholds against simple percentiles to find issues, but I'm presenting this query because this is a thing people commonly want to do that's useful and I can write this without having to spend a lot of space explaining why it's a reasonable thing to do; what I prefer to do instead is out of scope of this post and probably deserves its own post.

Network utilization

The above query was over all services, but we can also query across hosts. In addition, we can do queries that join against properties of the host, feature flags, etc.

Using one set of queries, we were able to determine that we had a significant number of services running up against network limits even though host-level network utilization was low.
The compute platform team then did a gradual rollout of a change to network caps, which we monitored with queries like the one below to determine that we weren't see any performance degradation (theoretically possible if increasing network caps caused hosts or switches to hit network limits).With the network change, we were able to observe, smaller queue depths, smaller queue size (in bytes), fewer packet drops, etc.The query below only shows queue depths for brevity; adding all of the quantities mentioned is just a matter of typing more names in.The general thing we can do is, for any particular rollout of a platform or service-level feature, we can see the impact on real services.with rolled as ( select -- rollout was fixed for all hosts during the time period, can pick an arbitrary element from the time period arbitrary(element_at(misc, 'egress_rate_limit_increase')) as rollout, hostId from ltm_deploys where ds = '2019-10-10' and zone = 'foo' group by ipAddress), host_info as( select arbitrary(nicSpeed) as nicSpeed, hostId from ltm_host where ds = '2019-10-10' and zone = 'foo' group by ipAddress), host_rolled as ( select rollout, nicSpeed, rolled.hostId from rolled join host_info on rolled.ipAddress = host_info.ipAddress), container_metrics as ( select service, netTxQlen, hostId from ltm_container where ds >= '2019-10-10' and ds <= '2019-10-14' and zone = 'foo')select service, nicSpeed, approx_percentile(netTxQlen, 1, 0.999, 0.0001) as p999_qlen, approx_percentile(netTxQlen, 1, 0.99, 0.001) as p99_qlen, approx_percentile(netTxQlen, 0.9) as p90_qlen, approx_percentile(netTxQlen, 0.68) as p68_qlen, rollout, count(*) as cntfrom container_metricsjoin host_rolled on host_rolled.hostId = container_metrics.hostIdgroup by service, nicSpeed, rolloutOther questions that became easy to answerWhat's the latency, CPU usage, CPI, or other performance impact of X?Increasing or decreasing the number of performance counters we monitor per containerTweaking kernel parametersOS or 
other releasesIncreasing or decreasing host-level oversubscriptionGeneral host-level loadRetry budget exhaustionFor relevant items above, what's the distribution of X, in general or under certain circumstances?What hosts have unusually poor service-level performance for every service on the host, after controlling for load, etc.?This has usually turned out to be due to a hardware misconfiguration or faultWhich services don't play nicely with other services aside from the general impact on host-level load?What's the latency impact of failover, or other high-load events?What level of load should we expect in the future given a future high-load event plus current growth?Which services see more load during failover, which services see unchanged load, and which fall somewhere in between?What config changes can we make for any fixed sized buffer or allocation that will improve performance without increasing cost or reduce cost without degrading performance?For some particular host-level health problem, what's the probability it recurs if we see it N times?etc., there are a lot of questions that become easy to answer if you can write arbitrary queries against historical metrics dataDesign decisionsLTM is about as boring a system as is possible. Every design decision falls out of taking the path of least resistance.Why using Scalding?It's standard at Twitter and the integration made everything trivial. I tried Spark, which has some advantages. However, at the time, I would have had to do manual integration work that I got for free with Scalding.Why use Presto and not something that allows for live slice & dice queries like Druid?Rebecca Isaacs and Jonathan Simms were doing related work on tracing and we knew that we'd want to do joins between LTM and whatever they created. 
That's trivial with Presto but would have required more planning and work with something like Druid, at least at the time.
- George Sirois imported a subset of the data into Druid so we could play with it and the facilities it offers are very nice; it's probably worth re-visiting at some point

Why not use Postgres or something similar?
- The amount of data we want to store makes this infeasible without a massive amount of effort; even though the cost of data storage is quite low, it's still a "big data" problem

Why Parquet instead of a more efficient format?
- It was the most suitable of the standard supported formats (the other major supported format is raw thrift); introducing a new format would be a much larger project than this project

Why is the system not real-time (with delays of at least one hour)?
- Twitter's batch job pipeline is easy to build on; all that was necessary was to read some tutorial on how it works and then write something similar, but with different business logic.
- There was a nicely written proposal to build a real-time analytics pipeline for metrics data written a couple years before I joined Twitter, but that never got built because (I estimate) it would have been one to four quarters of work to produce an MVP and it wasn't clear what team had the right mandate to work on that and also had 4 quarters of headcount available. 
But adding a batch job took one day; you don't need to have roadmap and planning meetings for a day of work, you can just do it and then do follow-on work incrementally.
- If we're looking for misconfigurations or optimization opportunities, these rarely go away within an hour (and if they did, they must've had small total impact) and, in fact, they often persist for months to years, so we don't lose much by giving up on real-time (we do lose the ability to use the output of this for some monitoring use cases)
- The real-time version would've been a system with significant operational cost that can't be operated by one person without undue burden. This system has more operational/maintenance burden than I'd like, probably 1-2 days of my time per month on average, which at this point makes that a pretty large fraction of the total cost of the system, but it never pages, and the amount of work can easily be handled by one person.

Boring technology

I think writing about systems like this, that are just boring work, is really underrated. A disproportionate number of posts and talks I read are about systems using hot technologies. I don't have anything against hot new technologies, but a lot of useful work comes from plugging boring technologies together and doing the obvious thing. Since posts and talks about boring work are relatively rare, I think writing up something like this is more useful than it has any right to be.

For example, a couple years ago, at a local meetup that Matt Singer organizes for companies in our size class to discuss infrastructure (basically, companies that are smaller than FB/Amazon/Google), I asked if anyone was doing something similar to what we'd just done. No one who was there was (or not who'd admit to it, anyway), and engineers from two different companies expressed shock that we could store so much data, and not just the average per time period, but some histogram information as well. 
This work is too straightforward and obvious to be novel, I'm sure people have built analogous systems in many places. It's literally just storing metrics data on HDFS (or, if you prefer a more general term, a data lake) indefinitely in a format that allows interactive queries.If you do the math on the cost of metrics data storage for a project like this in a company in our size class, the storage cost is basically a rounding error. We've shipped individual diffs that easily pay for the storage cost for decades. I don't think there's any reason storing a few years or even a decade worth of metrics should be shocking when people deploy analytics and observability tools that cost much more all the time. But it turns out this was surprising, in part because people don't write up work this boring.An unrelated example is that, a while back, I ran into someone at a similarly sized company who wanted to get similar insights out of their metrics data. Instead of starting with something that would take a day, like this project, they started with deep learning. While I think there's value in applying ML and/or stats to infra metrics, they turned a project that could return significant value to the company after a couple of person-days into a project that took person-years. And if you're only going to either apply simple heuristics guided by someone with infra experience and simple statistical models or naively apply deep learning, I think the former has much higher ROI. Applying both sophisticated stats/ML and practitioner guided heuristics together can get you better results than either alone, but I think it makes a lot more sense to start with the simple project that takes a day to build out and maybe another day or two to start to apply than to start with a project that takes months or years to build out and start to apply. 
But there are a lot of biases towards doing the larger project: it makes a better resume item (deep learning!), in many places, it makes a better promo case, and people are more likely to give a talk or write up a blog post on the cool system that uses deep learning.

The above discusses why writing up work is valuable for the industry in general. We covered why writing up work is valuable to the company doing the write-up in a previous post, so I'm not going to re-hash that here.

Appendix: stuff I screwed up

I think it's unfortunate that you don't get to hear about the downsides of systems without backchannel chatter, so here are things I did that are pretty obvious mistakes in retrospect. I'll add to this when something else becomes obvious in retrospect.

- Not using a double for almost everything
  - In an ideal world, some things aren't doubles, but everything in our metrics stack goes through a stage where basically every metric is converted to a double
  - I stored most things that "should" be an integral type as an integral type, but doing the conversion from long -> double -> long is never going to be more precise than just doing the long -> double conversion and it opens the door to other problems
  - I stored some things that shouldn't be an integral type as an integral type, which causes small values to unnecessarily lose precision
    - Luckily this hasn't caused serious errors for any actionable analysis I've done, but there are analyses where it could cause problems
- Using asserts instead of writing bad entries out to some kind of "bad entries" table
  - For reasons that are out of scope of this post, there isn't really a reasonable way to log errors or warnings in Scalding jobs, so I used asserts to catch things that shouldn't happen, which causes the entire job to die every time something unexpected happens; a better solution would be to write bad input entries out into a table and then have that table emailed out as a soft alert if the table isn't empty
  - An example of a case where this 
would've saved some operational overhead is where we had an unusual amount of clock skew (3600 years), which caused a timestamp overflow. If I had a table that was a log of bad entries, the bad entry would've been omitted from the output, which is the correct behavior, and it would've saved an interruption plus having to push a fix and re-deploy the job.Longterm vs. LongTerm in the codeI wasn't sure which way this should be capitalized when I was first writing this and, when I made a decision, I failed to grep for and squash everything that was written the wrong way, so now this pointless inconsistency exists in various placesThese are the kind of thing you expect when you crank out something quickly and don't think it through enough. The last item is trivial to fix and not much of a problem since the ubiquitous use of IDEs at Twitter means that basically anyone who would be impacted will have their IDE supply the correct capitalization for them.The first item is more problematic, both in that it could actually cause incorrect analyses and in that fixing it will require doing a migration of all the data we have. My guess is that, at this point, this will be half a week to a week of work, which I could've easily avoided by spending thirty more seconds thinking through what I was doing.The second item is somewhere in between. Between the first and second items, I think I've probably signed up for roughly double the amount of direct work on this system (so, not including time spent on data analysis on data in the system, just the time spent to build the system) for essentially no benefit.Thanks to Leah Hanson, Andy Wilcox, Lifan Zeng, and Matej Stuchlik for comments/corrections/discussionThe actual work involved was about a day's work, but it was done over a week since I had to learn Scala as well as Scalding and the general Twitter stack, the metrics stack, etc.One day is also just an estimate for the work for the initial data sets. 
Since then, I've done probably a couple more weeks of work and Wesley Aptekar-Cassels and Kunal Trivedi have probably put in another week or two of time. The operational cost is probably something like 1-2 days of my time per month (on average), bringing the total cost to on the order of a month or two.

I'm also not counting time spent using the dataset, or time spent debugging issues, which will include a lot of time that I can only roughly guess at, e.g., when the compute platform team changed the network egress limits as a result of some data analysis that took about an hour, that exposed a latent mesos bug that probably cost a day of Ilya Pronin's time; David Mackey has spent a fair amount of time tracking down weird issues where the data shows something odd is going on, but we don't know what is, etc. If you wanted to fully account for time spent on work that came out of some data analysis on the data sets discussed in the post, I suspect, between service-level teams, plus platform-level teams like our JVM, OS, and HW teams, we're probably at roughly 1 person-year of time.

But, because the initial work it took to create a working and useful system was a day plus time spent working on orientation material and the system returned seven figures, it's been very easy to justify all of this additional time spent, which probably wouldn't have been the case if a year of up-front work was required. 
Most of the rest of the time isn't the kind of thing that's usually "charged" on roadmap reviews on creating a system (time spent by users, operational overhead), but perhaps the ongoing operational cost should be "charged" when creating the system (I don't think it makes sense to "charge" time spent by users to the system since, the more useful a system is, the more time users will spend using it; that doesn't really seem like a cost).

There's also been work to build tools on top of this; Kunal Trivedi has spent a fair amount of time building a layer on top of this to make the presentation more user friendly than SQL queries, which could arguably be charged to this project.
How (some) good corporate engineering blogs are written
I've been comparing notes with people who run corporate engineering blogs and one thing that I think is curious is that it's pretty common for my personal blog to get more traffic than the entire corp eng blog for a company with a nine to ten figure valuation and it's not uncommon for my blog to get an order of magnitude more traffic.I think this is odd because tech companies in that class often have hundreds to thousands of employees. They're overwhelmingly likely to be better equipped to write a compelling blog than I am and companies get a lot more value from having a compelling blog than I do.With respect to the former, employees of the company will have done more interesting engineering work, have more fun stories, and have more in-depth knowledge than any one person who has a personal blog. On the latter, my blog helps me with job searching and it helps companies hire. But I only need one job, so more exposure, at best, gets me a slightly better job, whereas all but one tech company I've worked for is desperate to hire and loses candidates to other companies all the time. Moreover, I'm not really competing against other candidates when I interview (even if we interview for the same job, if the company likes more than one of us, it will usually just make more jobs). The high-order bit on this blog with respect to job searching is whether or not the process can take significant non-interview feedback or if I'll fail the interview because they do a conventional interview and the marginal value of an additional post is probably very low with respect to that. On the other hand, companies compete relatively directly when recruiting, so being more compelling relative to another company has value to them; replicating the playbook Cloudflare or Segment has used with their engineering "brands" would be a significant recruiting advantage. 
The playbook isn't secret: these companies broadcast their output to the world and are generally happy to talk about their blogging process.

Despite the seemingly obvious benefits of having a "good" corp eng blog, most corp eng blogs are full of stuff engineers don't want to read. Vague, high-level fluff about how amazing everything is, content marketing, handwave-y posts about the new hotness (today, that might be using deep learning for inappropriate applications; ten years ago, that might have been using "big data" for inappropriate applications), etc.

To try to understand what companies with good corporate engineering blogs have in common, I interviewed folks at three different companies that have compelling corporate engineering blogs (Cloudflare, Heap, and Segment) as well as folks at three different companies that have lame corporate engineering blogs (which I'm not going to name).

At a high level, the compelling engineering blogs had processes that shared the following properties:
- Easy approval process, not many approvals necessary
- Few or no non-engineering approvals required
- Implicit or explicit fast SLO on approvals
- Approval/editing process mainly makes posts more compelling to engineers
- Direct, high-level (co-founder, C-level, or VP-level) support for keeping blog process lightweight

The less compelling engineering blogs had processes that shared the following properties:
- Slow approval process
- Many approvals necessary
- Significant non-engineering approvals necessary
  - Non-engineering approvals suggest changes authors find frustrating
  - Back-and-forth can go on for months
- Approval/editing process mainly de-risks posts, removes references to specifics, makes posts vaguer and less interesting to engineers
- Effectively no high-level support for blogging
  - Leadership may agree that blogging is good in the abstract, but it's not a high enough priority to take concrete action
- Reforming process to make blogging easier very difficult; previous efforts have failed
  - Changing process to reduce 
overhead requires all "stakeholders" to sign off (14 in one case)
    - Any single stakeholder can block
    - No single stakeholder can approve
  - Stakeholders wary of approving anything that reduces overhead
    - Approving involves taking on perceived risk (what if something bad happens) with no perceived benefit to them

One person at a company with a compelling blog noted that a downside of having only one approver and/or one primary approver is that if that person is busy, it can take weeks to get posts approved. That's fair, that's a downside of having centralized approval. However, when we compare to the alternative processes, at one company, people noted that it's typical for approvals to take three to six months and tail cases can take a year.

While a few weeks can seem like a long time for someone used to a fast moving company, people at slower moving companies would be ecstatic to have an approval process that only takes twice that long.

Here are the processes, as described to me, for the three companies I interviewed (presented in sha512sum order, which is coincidentally ordered by increasing size of company, from a couple hundred employees to nearly one thousand employees):

Heap
- Someone has an idea to write a post
- Writer (who is an engineer) is paired with a "buddy", who edits and then approves the post
  - Buddy is an engineer who has a track record of producing reasonable writing
  - This may take a few rounds, may change thrust of the post
- CTO reads and approves
  - Usually only minor feedback
  - May make suggestions like "a designer could make this graph look better"
- Publish post

The first editing phase used to involve posting a draft to a slack channel where "everyone" would comment on the post. This was an unpleasant experience since "everyone" would make comments and a lot of revision would be required. 
This process was designed to avoid getting "too much" feedback.

Segment
- Someone has an idea to write a post
  - Often comes from: internal docs, external talk, shipped project, open source tooling (built by Segment)
- Writer (who is an engineer) writes a draft
  - May have a senior eng work with them to write the draft
- Until recently, no one really owned the feedback process
  - Calvin French-Owen (co-founder) and Rick (engineering manager) would usually give most feedback
  - Maybe also get feedback from manager and eng leadership
  - Typically, 3rd draft is considered finished
  - Now, have a full-time editor who owns editing posts
- Also socialize among eng team, get feedback from 15-20 people
- PR and legal will take a look, lightweight approval

Some changes that have been made include
- At one point, when trying to establish an "engineering brand", making in-depth technical posts a top-level priority
- had a "blogging retreat", one week spent on writing a post
- added writing and speaking as explicit criteria to be rewarded in performance reviews and career ladders

Although there's legal and PR approval, Calvin noted "In general we try to keep it fairly lightweight. I see the bigger problem with blogging being a lack of posts or vague, high level content which isn't interesting rather than revealing too much."

Cloudflare
- Someone has an idea to write a post
  - Internal blogging is part of the culture, some posts come from the internal blog
- John Graham-Cumming (CTO) reads every post, other folks will read and comment
  - John is approver for posts
- Matthew Prince (CEO) also generally supportive of blogging
- "Very quick" legal approval process, SLO of 1 hour
  - This process is so lightweight that one person didn't really think of it as an approval, another person didn't mention it at all (a third person did mention this step)
- Comms generally not involved

One thing to note is that this only applies to technical blog posts. 
Product announcements have a heavier process because they're tied to sales material, press releases, etc.One thing I find interesting is that Marek interviewed at Cloudflare because of their blog (this 2013 blog post on their 4th generation servers caught his eye) and he's now both a key engineer for them as well as one of the main sources of compelling Cloudflare blog posts. At this point, the Cloudflare blog has generated at least a few more generations of folks who interviewed because they saw a blog post and now write compelling posts for the blog.General commentsMy opinion is that the natural state of a corp eng blog where people get a bit of feedback is a pretty interesting blog. There's a dearth of real, in-depth, technical writing, which makes any half decent, honest, public writing about technical work interesting.In order to have a boring blog, the corporation has to actively stop engineers from putting interesting content out there. Unfortunately, it appears that the natural state of large corporations tends towards risk aversion and blocking people from writing, just in case it causes a legal or PR or other problem. Individual contributors (ICs) might have the opinion that it's ridiculous to block engineers from writing low-risk technical posts while, simultaneously, C-level execs and VPs regularly make public comments that turn into PR disasters, but ICs in large companies don't have the authority or don't feel like they have the authority to do something just because it makes sense. And none of the fourteen stakeholders who'd have to sign off on approving a streamlined process care about streamlining the process since that would be good for the company in a way that doesn't really impact them, not when that would mean seemingly taking responsibility for the risk a streamlined process would add, however small. 
An exec or a senior VP willing to take a risk can take responsibility for the fallout and, if they're interested in engineering recruiting or morale, they may see a reason to do so.

One comment I've often heard from people at more bureaucratic companies is something like "every company our size is like this", but that's not true. Cloudflare, a $6B company approaching 1k employees, is in the same size class as many other companies with a much more onerous blogging process. The corp eng blog situation seems similar to the situation on giving real interview feedback. interviewing.io claims that there's significant upside and very little downside to doing so. Some companies actually do give real feedback and the ones that do generally find that it gives them an easy advantage in recruiting with little downside, but the vast majority of companies don't do this and people at those companies will claim that it's impossible to give feedback since you'll get sued or the company will be "cancelled" even though this generally doesn't happen to companies that give feedback and there are even entire industries where it's common to give interview feedback. It's easy to handwave that some risk exists and very few people have the authority to dismiss vague handwaving about risk when it's coming from multiple orgs.

Although this is a small sample size and it's dangerous to generalize too much from small samples, the idea that you need high-level support to blast through bureaucracy is consistent with what I've seen in other areas where most large companies have a hard time doing something easy that has obvious but diffuse value. While this post happens to be about blogging, I've heard stories that are the same shape on a wide variety of topics.

Appendix: examples of compelling blog posts

Here are some blog posts from the blogs mentioned with a short comment on why I thought the post was compelling. 
This time, in reverse sha512 hash order.

Cloudflare
- https://blog.cloudflare.com/how-verizon-and-a-bgp-optimizer-knocked-large-parts-of-the-internet-offline-today/
  - Talks about a real technical problem that impacted a lot of people, reasonably in depth
  - Timely, released only eight hours after the outage, when people were still really interested in hearing about what happened; most companies can't turn around a compelling blog post this quickly or can only do it on a special-case basis, Cloudflare is able to crank out timely posts semi-regularly
- https://blog.cloudflare.com/the-relative-cost-of-bandwidth-around-the-world/
  - Exploration of some data
- https://blog.cloudflare.com/the-story-of-one-latency-spike/
  - A debugging story
- https://blog.cloudflare.com/when-bloom-filters-dont-bloom/
  - A debugging story, this time in the context of developing a data structure

Segment
- https://segment.com/blog/when-aws-autoscale-doesn-t/
  - Concrete explanation of a gotcha in a widely used service
- https://segment.com/blog/gotchas-from-two-years-of-node/
  - Concrete example and explanation of a gotcha in a widely used tool
- https://segment.com/blog/automating-our-infrastructure/
  - Post with specific details about how a company operates; in theory, any company could write this, but few do

Heap
- https://heap.io/blog/engineering/basic-performance-analysis-saved-us-millions
  - Talks about a real problem and solution
- https://heap.io/blog/engineering/clocksource-aws-ec2-vdso
  - Talks about a real problem and solution
  - In HN comments, engineers (malisper, kalmar) have technical responses with real reasons in them and not just the usual dissembling that you see in most cases
- https://heap.io/blog/analysis/migrating-to-typescript
  - Real talk about how the first attempt at driving a company-wide technical change failed

One thing to note is that these blogs all have different styles. Personally, I prefer the style of Cloudflare's blog, which has a higher proportion of "deep dive" technical posts, but different people will prefer different styles. 
There are a lot of styles that can work.

Thanks to Marek Majkowski, Kamal Marhubi, Calvin French-Owen, John Graham-Cumming, Michael Malis, Matthew Prince, Yuri Vishnevsky, Julia Evans, Wesley Aptekar-Cassels, Nathan Reed, Jake Seliger, an anonymous commenter, plus sources from the companies I didn't name for comments/corrections/discussion; none of the people explicitly mentioned in the acknowledgements were sources for information on the less compelling blogs.
My hobby: opening up McIlroy's UNIX philosophy on one monitor while reading manpages on the other.

The first of McIlroy's dicta is often paraphrased as "do one thing and do it well", which is shortened from "Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new 'features.'"

McIlroy's example of this dictum is:

"Surprising to outsiders is the fact that UNIX compilers produce no listings: printing can be done better and more flexibly by a separate program."

If you open up a manpage for ls on mac, you'll see that it starts with

ls [-ABCFGHLOPRSTUW@abcdefghiklmnopqrstuwx1] [file ...]

That is, the one-letter flags to ls include every lowercase letter except for {jvyz}, 14 uppercase letters, plus @ and 1. That's 22 + 14 + 2 = 38 single-character options alone.

On ubuntu 17, if you read the manpage for coreutils ls, you don't get a nice summary of options, but you'll see that ls has 58 options (including --help and --version).

To see if ls is an aberration or if it's normal to have commands that do this much stuff, we can look at some common commands, sorted by frequency of use.

command    1979  1996  2015  2017
ls           11    42    58    58
rm            3     7    11    12
mkdir         0     4     6     7
mv            0     9    13    14
cp            0    18    30    32
cat           1    12    12    12
pwd           0     2     4     4
chmod         0     6     9     9
echo          1     4     5     5
man           5    16    39    40
which         -     0     1     1
sudo          -     0    23    25
tar          12    53   134   139
touch         1     9    11    11
clear         -     0     0     0
find         14    57    82    82
ln            0    11    15    16
ps            4    22    85    85
ping          -    12    12    29
kill          1     3     3     3
ifconfig      -    16    25    25
chown         0     6    15    15
grep         11    22    45    45
tail          1     7    12    13
df            0    10    17    18
top           -     6    12    14

This table has the number of command line options for various commands for v7 Unix (1979), slackware 3.1 (1996), ubuntu 12 (2015), and ubuntu 17 (2017). 
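As a quick sanity check on the 22 + 14 + 2 = 38 arithmetic, here is a minimal Python sketch that tallies the flag string from the mac ls synopsis quoted earlier. The variable names are only for illustration; the flag string is the only thing taken from the manpage.

```python
import string

# Flag string copied from the mac ls synopsis quoted in the text.
LS_FLAGS = "ABCFGHLOPRSTUW@abcdefghiklmnopqrstuwx1"

# Bucket the flags by character class.
lower = [c for c in LS_FLAGS if c in string.ascii_lowercase]
upper = [c for c in LS_FLAGS if c in string.ascii_uppercase]
other = [c for c in LS_FLAGS if not c.isalpha()]

# Lowercase letters that are NOT used as flags.
missing_lower = sorted(set(string.ascii_lowercase) - set(lower))

print(len(lower), len(upper), len(other), len(LS_FLAGS))
print("unused lowercase letters:", "".join(missing_lower))
```

The three buckets add up to the length of the flag string, and the unused lowercase letters are the {jvyz} set mentioned in the text.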
Cells are darker and blue-er when they have more options (log scale) and are greyed out if no command was found.

We can see that the number of command line options has dramatically increased over time; entries tend to get darker going to the right (more options) and there are no cases where entries get lighter (fewer options). McIlroy has long decried the increase in the number of options, size, and general functionality of commands [1]:

"Everything was small and my heart sinks for Linux when I see the size [inaudible]. The same utilities that used to fit in eight k[ilobytes] are a meg now. And the manual page, which used to really fit on, which used to really be a manual page, is now a small volume with a thousand options... We used to sit around in the UNIX room saying "what can we throw out? Why is there this option?" It's usually, it's often because there's some deficiency in the basic design, you didn't really hit the right design point. Instead of putting in an option, figure out why, what was forcing you to add that option. This viewpoint, which was imposed partly because there was very small hardware ... has been lost and we're not better off for it."

Ironically, one of the reasons for the rise in the number of command line options is another McIlroy dictum, "Write programs to handle text streams, because that is a universal interface" (see ls for one example of this).

If structured data or objects were passed around, formatting could be left to a final formatting pass. But, with plain text, the formatting and the content are intermingled; because formatting can only be done by parsing the content out, it's common for commands to add formatting options for convenience. Alternately, formatting can be done when the user leverages their knowledge of the structure of the data and encodes that knowledge into arguments to cut, awk, sed, etc. 
(also using their knowledge of how those programs handle formatting; it's different for different programs and the user is expected to, for example, know how cut -f4 is different from awk '{ print $4 }' [2]). That's a lot more hassle than passing in one or two arguments to the last command in a sequence and it pushes the complexity from the tool to the user.

People sometimes say that they don't want to support structured data because they'd have to support multiple formats to make a universal tool, but they already have to support multiple formats to make a universal tool. Some standard commands can't read output from other commands because they use different formats, wc -w doesn't handle Unicode correctly, etc. Saying that "text" is a universal format is like saying that "binary" is a universal format.

I've heard people say that there isn't really any alternative to this kind of complexity for command line tools, but people who say that have never really tried the alternative, something like PowerShell. I have plenty of complaints about PowerShell, but passing structured data around and easily being able to operate on structured data without having to hold metadata information in my head so that I can pass the appropriate metadata to the right command line tools at the right places in the pipeline isn't among my complaints [3].

The sleight of hand that's happening when someone says that we can keep software simple and compatible by making everything handle text is the pretense that text data doesn't have a structure that needs to be parsed [4]. In some cases, we can just think of everything as a single space separated line, or maybe a table with some row and column separators that we specify (with some behavior that isn't consistent across tools, of course). 
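To make the cut -f4 versus awk '{ print $4 }' inconsistency concrete, here is a rough Python sketch of the two default field-splitting rules. The helper names cut_f4 and awk_4 are hypothetical, and the functions only approximate the default behavior (cut splitting on every single TAB and passing delimiter-free lines through unchanged, awk splitting on runs of whitespace); they are not reimplementations of either tool.

```python
def cut_f4(line: str) -> str:
    """Approximate `cut -f4`: fields are separated by single TABs, never collapsed."""
    if "\t" not in line:
        # Without -s, cut prints lines that contain no delimiter unchanged.
        return line
    fields = line.split("\t")
    return fields[3] if len(fields) > 3 else ""


def awk_4(line: str) -> str:
    """Approximate `awk '{ print $4 }'`: fields are separated by runs of whitespace."""
    fields = line.split()
    return fields[3] if len(fields) > 3 else ""


# One empty TAB-separated field, and one field that contains a space.
row = "a\tb\t\td e\tf"
print(repr(cut_f4(row)))  # fourth TAB-delimited field
print(repr(awk_4(row)))   # fourth whitespace-delimited field

# A purely space-separated line has no TAB for cut to split on.
print(repr(cut_f4("a b c d")))
print(repr(awk_4("a b c d")))
```

On the same input the two rules pick different "fourth fields", which is the kind of metadata the user has to carry around in their head when the data is passed as flat text.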
That adds some hassle when it works, and then there are the cases where serializing data to a flat text format adds considerable complexity since the structure of data means that simple flattening requires significant parsing work to re-ingest the data in a meaningful way.

Another reason commands now have more options is that people have added convenience flags for functionality that could have been done by cobbling together a series of commands. These go all the way back to v7 unix, where ls has an option to reverse the sort order (which could have been done by passing the output to something like tac had they written tac instead of adding a special-case reverse option).

Over time, more convenience options have been added. For example, to pick a command that originally had zero options, mv can move and create a backup (three options; two are different ways to specify a backup, one of which takes an argument and the other of which takes zero explicit arguments and reads an implicit argument from the VERSION_CONTROL environment variable; one option allows overriding the default backup suffix). mv now also has options to never overwrite and to only overwrite if the file is newer.

mkdir is another program that used to have no options where, excluding security things for SELinux or SMACK as well as help and version options, the added options are convenience flags: setting the permissions of the new directory and making parent directories if they don't exist.

If we look at tail, which originally had one option (-number, telling tail where to start), it's added both formatting and convenience options. For formatting, it has -z, which makes the line delimiter null instead of a newline. Some examples of convenience options are -f to print when there are new changes, -s to set the sleep interval between checking for -f changes, and --retry to retry if the file isn't accessible.

McIlroy says "we're not better off" for having added all of these options but I'm better off.
I've never used some of the options we've discussed and only rarely use others, but that's the beauty of command line options: unlike with a GUI, adding these options doesn't clutter up the interface. The manpage can get cluttered, but in the age of google and stackoverflow, I suspect many people just search for a solution to what they're trying to do without reading the manpage anyway.

This isn't to say there's no cost to adding options; more options means more maintenance burden, but that's a cost that maintainers pay to benefit users, which isn't obviously unreasonable considering the ratio of maintainers to users. This is analogous to Gary Bernhardt's comment that it's reasonable to practice a talk fifty times since, if there's a three hundred person audience, the ratio of time spent practicing to time spent watching the talk will still only be 1:6. In general, this ratio will be even more extreme with commonly used command line tools.

Someone might argue that all these extra options create a burden for users. That's not exactly wrong, but that complexity burden was always going to be there; it's just a question of where the burden was going to lie. If you think of the set of command line tools along with a shell as forming a language, a language where anyone can write a new method and it effectively gets added to the standard library if it becomes popular, where standards are defined by dicta like "write programs to handle text streams, because that is a universal interface", the language was always going to turn into a write-only incoherent mess when taken as a whole.
At least with tools that bundle up more functionality and options than is UNIX-y, users can replace a gigantic set of wildly inconsistent tools with a merely large set of tools that, while inconsistent with each other, may have some internal consistency.

McIlroy implies that the problem is that people didn't think hard enough; the old school UNIX mavens would have sat down in the same room and thought longer and harder until they came up with a set of consistent tools that has "unusual simplicity". But that was never going to scale; the philosophy made the mess we're in inevitable. It's not a matter of not thinking longer or harder; it's a matter of having a philosophy that cannot scale unless you have a relatively small team with a shared cultural understanding, able to sit down in the same room.

If anyone can write a tool and the main instruction comes from "the unix philosophy", people will have different opinions about what "simplicity" or "doing one thing"[5] means, what the right way to do things is, and inconsistency will bloom, resulting in the kind of complexity you get when dealing with a wildly inconsistent language, like PHP. People make fun of PHP and javascript for having all sorts of warts and weird inconsistencies, but as a language and a standard library, any commonly used shell plus the collection of widely used *nix tools taken together is much worse and contains much more accidental complexity due to inconsistency, even within a single Linux distro, and there's no other way it could have turned out. If you compare across Linux distros, BSDs, Solaris, AIX, etc., the amount of accidental complexity that users have to hold in their heads when switching systems dwarfs PHP or javascript's incoherence.
The most widely mocked programming languages are paragons of great design by comparison.

To be clear, I'm not saying that I or anyone else could have done better with the knowledge available in the 70s in terms of making a system that was practically useful at the time that would be elegant today. It's easy to look back and find issues with the benefit of hindsight. What I disagree with are comments from Unix mavens speaking today; comments like McIlroy's, which imply that we just forgot or don't understand the value of simplicity, or Ken Thompson saying that C is as safe a language as any and if we don't want bugs we should just write bug-free code. These kinds of comments imply that there's not much to learn from hindsight; in the 70s, we were building systems as effectively as anyone can today; five decades of collective experience, tens of millions of person-years, have taught us nothing; if we just go back to building systems like the original Unix mavens did, all will be well. I respectfully disagree.

Appendix: memory

Although addressing McIlroy's complaints about binary size bloat is a bit out of scope for this, I will note that, in 2017, I bought a Chromebook that had 16GB of RAM for $300. A 1 meg binary might have been a serious problem in 1979, when a standard Apple II had 4KB. An Apple II cost $1298 in 1979 dollars, or $4612 in 2020 dollars. You can get a low end Chromebook that costs less than 1/15th as much which has four million times more memory. Complaining that memory usage grew by a factor of one thousand when a (portable!) machine that's more than an order of magnitude cheaper has four million times more memory seems a bit ridiculous.

I prefer slimmer software, which is why I optimized my home page down to two packets (it would be a single packet if my CDN served high-level brotli), but that's purely an aesthetic preference, something I do for fun.
The bottleneck for command line tools isn't memory usage and spending time optimizing the memory footprint of a tool that takes one meg is like getting a homepage down to a single packet. Perhaps a fun hobby, but not something that anyone should prescribe.

Methodology for table

Command frequencies were sourced from public command history files on github, not necessarily representative of your personal usage. Only "simple" commands were kept, which ruled out things like curl, git, gcc (which has > 1000 options), and wget. What's considered simple is arbitrary. Shell builtins, like cd, weren't included.

Repeated options aren't counted as separate options. For example, git blame -C, git blame -C -C, and git blame -C -C -C have different behavior, but these would all be counted as a single argument even though -C -C is effectively a different argument from -C.

The table counts sub-options as a single option. For example, ls has the following:

    --format=WORD
        across -x, commas -m, horizontal -x, long -l, single-column -1, verbose -l, vertical -C

Even though there are seven format options, this is considered to be only one option.

Options that are explicitly listed as not doing anything are still counted as options, e.g., ls -g, which reads "Ignored; for Unix compatibility.", is counted as an option.

Multiple versions of the same option are also considered to be one option. For example, with ls, -A and --almost-all are counted as a single option.

In cases where the manpage says an option is supposed to exist, but doesn't, the option isn't counted. For example, the v7 mv manpage says

BUGS

If file1 and file2 lie on different file systems, mv must copy the file and delete the original.
In this case the owner name becomes that of the copying process and any linking relationship with other files is lost.

Mv should take -f flag, like rm, to suppress the question if the target exists and is not writable.

-f isn't counted as a flag in the table because the option doesn't actually exist.

The latest year in the table is 2017 because I wrote the first draft for this post in 2017 and didn't get around to cleaning it up until 2020.

Related

mjd on the Unix philosophy, with an aside into the mess of /usr/bin/time vs. built-in time.

mjd making a joke about the proliferation of command line options in 1991.

On HN:

p1mrx:

It's strange that ls has grown to 58 options, but still can't output \0-terminated filenames

As an exercise, try to sort a directory by size or date, and pass the result to xargs, while supporting any valid filename. I eventually just gave up and made my script ignore any filenames containing \n.

whelming_wave:

Here you go: sort all files in the current directory by modification time, whitespace-in-filenames-safe.

The `printf (od -> sed)' construction converts back out of null-separated characters into newline-separated, though feel free to replace that with anything accepting null-separated input. Granted, `sort --zero-terminated' is a GNU extension and kinda cheating, but it's even available on macOS so it's probably fine.

    printf '%b' $(
      find . -maxdepth 1 -exec sh -c '
        printf '\''%s %s\0'\'' "$(stat -f '\''%m'\'' "$1")" "$1"
      ' sh {} \; | \
      sort --zero-terminated | \
      od -v -b | \
      sed 's/^[^ ]*//
           s/ *$//
           s/  */ \\/g
           s/\\000/\\012/g')

If you're running this under zsh, you'll need to prefix it with `command' to use the system executable: zsh's builtin printf doesn't support printing octal escape codes for normally printable characters, and you may have to assign the output to a variable and explicitly word-split it.

This is all POSIX as far as I know, except for the sort.

The Unix haters handbook.

Why create a new shell?

Thanks to Leah Hanson, Hillel Wayne, Wesley Aptekar-Cassels, Mark Jason Dominus, Travis Downs, and Yuri Vishnevsky for comments/corrections/discussion.

[1] This quote is slightly different than the version I've seen everywhere because I watched the source video. AFAICT, every copy of this quote that's on the internet (indexed by Bing, DuckDuckGo, or Google) is a copy of one person's transcription of the quote. There's some ambiguity because the audio is low quality and I hear something a bit different than whoever transcribed that quote heard.

[2] Another example of something where the user absorbs the complexity because different commands handle formatting differently is time formatting: the shell builtin time is, of course, inconsistent with /usr/bin/time and the user is expected to know this and know how to handle it.

[3] Just for example, you can use ConvertTo-Json or ConvertTo-CSV on any object, you can use cmdlets to change how properties are displayed for objects, and you can write formatting configuration files that define how you prefer things to be formatted.

Another way to look at this is through the lens of Conway's law. If we have a set of command line tools that are built by different people, often not organizationally connected, the tools are going to be wildly inconsistent unless someone can define a standard and get people to adopt it.
This actually works relatively well on Windows, and not just in PowerShell.

A common complaint about Microsoft is that they've created massive API churn, often for non-technical organizational reasons (e.g., a Sinofsky power play, like the one described in the replies to the now-deleted Tweet at https://twitter.com/stevesi/status/733654590034300929). It's true. Even so, from the standpoint of a naive user, off-the-shelf Windows software is generally a lot better at passing non-textual data around than *nix. One thing this falls out of is Windows's embracing of non-textual data, which goes back at least to COM in 1993 (and arguably OLE and DDE, released in 1990 and 1987, respectively).

For example, if you copy from Foo, which supports binary formats A and B, into Bar, which supports formats B and C and you then copy from Bar into Baz, which supports C and D, this will work even though Foo and Baz have no commonly supported formats. When you cut/copy something, the application basically "tells" the clipboard what formats it could provide data in. When you paste into the application, the destination application can request the data in any of the formats in which it's available. If the data is already in the clipboard, "Windows" provides it. If it isn't, Windows gets the data from the source application and then gives it to the destination application and a copy is saved for some length of time in Windows. If you "cut" from Excel it will tell "you" that it has the data available in many tens of formats. This kind of system is pretty good for compatibility, although it definitely isn't simple or minimal.

In addition to nicely supporting many different formats and doing so for long enough that a lot of software plays nicely with this, Windows also generally has nicer clipboard support out of the box.

Let's say you copy and then paste a small amount of text. Most of the time, this will work like you'd expect on both Windows and Linux.
But now let's say you copy some text, close the program you copied from, and then paste it. A mental model that a lot of people have is that when they copy, the data is stored in the clipboard, not in the program being copied from. On Windows, software is typically written to conform to this expectation (although, technically, users of the clipboard API don't have to do this). This is less common on Linux with X, where the correct mental model for most software is that copying stores a pointer to the data, which is still owned by the program the data was copied from, which means that paste won't work if the program is closed. When I've (informally) surveyed programmers, they're usually surprised by this if they haven't actually done copy+paste related work for an application. When I've surveyed non-programmers, they tend to find the behavior to be confusing as well as surprising.

The downside of having the OS effectively own the contents of the clipboard is that it's expensive to copy large amounts of data. Let's say you copy a really large amount of text, many gigabytes, or some complex object and then never paste it. You don't really want to copy that data from your program into the OS so that it can be available. Windows also handles this reasonably: applications can provide data only on request when that's deemed advantageous. In the case mentioned above, when someone closes the program, the program can decide whether or not it should push that data into the clipboard or discard it. In that circumstance, a lot of software (e.g., Excel) will prompt to "keep" the data in the clipboard or discard it, which is pretty reasonable.

It's not impossible to support some of this on Linux.
For example, the ClipboardManager spec describes a persistence mechanism and GNOME applications generally kind of sort of support it (although there are some bugs) but the situation on *nix is really different from the more pervasive support Windows applications tend to have for nice clipboard behavior.

[4] Another example of this is tools that are available on top of modern compilers. If we go back and look at McIlroy's canonical example, how proper UNIX compilers are so specialized that listings are a separate tool, we can see that this has changed even if there's still a separate tool you can use for listings. Some commonly used Linux compilers have literally thousands of options and do many things. For example, one of the many things clang now does is static analysis. As of this writing, there are 79 normal static analysis checks and 44 experimental checks. If these were separate commands (perhaps individual commands or perhaps a static_analysis command), they'd still rely on the same underlying compiler infrastructure and impose the same maintenance burden; it's not really reasonable to have these static analysis tools operate on plain text and reimplement the entire compiler toolchain necessary to get to the point where they can do static analysis.
They could be separate commands instead of bundled into clang, but they'd still take a dependency on the same machinery that's used for the compiler and either impose a maintenance and complexity burden on the compiler (which has to support non-breaking interfaces for the tools built on top) or they'd break all the time.

"Just make everything text so that it's simple" makes for a nice soundbite, but in reality the textual representation of the data is often not what you want if you want to do actually useful work.

And on clang in particular, whether you make it a monolithic command or thousands of smaller commands, clang simply does more than any compiler that existed in 1979 or even all compilers that existed in 1979 combined. It's easy to say that things were simpler in 1979 and that we modern programmers have lost our way. It's harder to actually propose a design that's actually much simpler and could really get adopted. It's impossible that such a design could maintain all of the existing functionality and configurability and be as simple as something from 1979.

[5] Since its inception, curl has gone from supporting 3 protocols to 40. Does that mean it does 40 things and it would be more "UNIX-y" to split it up into 40 separate commands? Depends on who you ask. If each protocol were its own command, created and maintained by a different person, we'd be in the same situation we are with other commands: inconsistent command line options, inconsistent output formats despite it all being text streams, etc. Would that be closer to the simplicity McIlroy advocates for? Depends on who you ask.
If you read any personal finance forums late last year, there's a decent chance you ran across a question from someone who was desperately trying to lose money before the end of the year. There are a number of ways someone could do this; one commonly suggested scheme was to buy put options that were expected to expire worthless, allowing the buyer to (probably) take a loss.

One reason people were looking for ways to lose money was that, in the U.S., there's a hard income cutoff for a health insurance subsidy at $48,560 for individuals (higher for larger households; $100,400 for a family of four). There are a number of factors that can cause the details to vary (age, location, household size, type of plan), but across all circumstances, it wouldn't have been uncommon for an individual going from one side of the cut-off to the other to have their health insurance cost increase by roughly $7200/yr. That means if an individual buying ACA insurance was going to earn $55k, they'd be better off reducing their income by $6440 and getting under the $48,560 subsidy ceiling than they would be earning $55k.

Although that's an unusually severe example, U.S. tax policy is full of discontinuities that disincentivize increasing earnings and, in some cases, actually incentivize decreasing earnings. Some other discontinuities are the TANF income limit, the Medicaid income limit, the CHIP income limit for free coverage, and the CHIP income limit for reduced-cost coverage. These vary by location and circumstance; the TANF and Medicaid income limits fall into ranges generally considered to be "low income" and the CHIP limits fall into ranges generally considered to be "middle class". These subsidy discontinuities have the same impact as the ACA subsidy discontinuity -- at certain income levels, people are incentivized to lose money.

Anyone may arrange his affairs so that his taxes shall be as low as possible; he is not bound to choose that pattern which best pays the treasury.
There is not even a patriotic duty to increase one's taxes. Over and over again the Courts have said that there is nothing sinister in so arranging affairs as to keep taxes as low as possible. Everyone does it, rich and poor alike and all do right, for nobody owes any public duty to pay more than the law demands.

If you agree with the famous Learned Hand quote then losing money in order to reduce effective tax rate, increasing disposable income, is completely legitimate behavior at the individual level. However, a tax system that encourages people to lose money, perhaps by funneling it to (on average) much wealthier options traders by buying put options, seems sub-optimal.

A simple fix for the problems mentioned above would be to have slow phase-outs instead of sharp thresholds. Slow phase-outs are actually done for some subsidies and, while that can also have problems, they are typically less problematic than introducing a sharp discontinuity in tax/subsidy policy.

In this post, we'll look at a variety of discontinuities.

Hardware or software queues

A naive queue has discontinuous behavior. If the queue is full, new entries are dropped. If the queue isn't full, new entries are not dropped. Depending on your goals, this can often have impacts that are non-ideal.
For example, in networking, a naive queue might be considered "unfair" to bursty workloads that have low overall bandwidth utilization because workloads that have low bandwidth utilization "shouldn't" suffer more drops than workloads that are less bursty but use more bandwidth (this is also arguably not unfair, depending on what your goals are).

A class of solutions to this problem is random early drop and its variants, which give incoming items a probability of being dropped that can be determined by queue fullness (and possibly other factors), smoothing out the discontinuity and mitigating issues caused by having a discontinuous probability of queue drops.

This post on voting in link aggregators is fundamentally the same idea although, in some sense, the polarity is reversed. There's a very sharp discontinuity in how much traffic something gets based on whether or not it's on the front page. You could view this as a link getting dropped from a queue if it only receives N-1 votes and not getting dropped if it receives N votes.

College admissions and Pell Grant recipients

Pell Grants started getting used as a proxy for how serious schools are about helping/admitting low-income students. The first order impact is that students above the Pell Grant threshold had a significantly reduced probability of being admitted while students below the Pell Grant threshold had a significantly higher chance of being admitted. Phrased that way, it sounds like things are working as intended.

However, when we look at what happens within each group, we see outcomes that are the opposite of what we'd want if the goal is to benefit students from low income families. Among people who don't qualify for a Pell Grant, it's those with the lowest income who are the most severely impacted and have the most severely reduced probability of admission.
Among people who do qualify, it's those with the highest income who are most likely to benefit, again the opposite of what you'd probably want if your goal is to benefit students from low income families.

We can see these in the graphs below, which are histograms of parental income among students at two universities in 2008 (first graph) and 2016 (second graph), where the red line indicates the Pell Grant threshold.

A second order effect of universities optimizing for Pell Grant recipients is that savvy parents can do the same thing that some people do to cut their taxable income at the last minute. Someone might put money into a traditional IRA instead of a Roth IRA and, if they're at their IRA contribution limit, they can try to lose money on options, effectively transferring money to options traders who are likely to be wealthier than them, in order to bring their income below the Pell Grant threshold, increasing the probability that their children will be admitted to a selective school.

Election statistics

The following histograms of Russian elections across polling stations show curious spikes in turnout and results at nice, round numbers (e.g., 95%) starting around 2004. This appears to indicate that there's election fraud via fabricated results and that at least some of the people fabricating results don't bother with fabricating results that have a smooth distribution.

For finding fraudulent numbers, also see Benford's law.

p-values

Authors of psychology papers are incentivized to produce papers with p values below some threshold, usually 0.05, but sometimes 0.1 or 0.01. Masicampo et al.
plotted p values from papers published in three psychology journals and found a curiously high number of papers with p values just below 0.05.

The spike at p = 0.05 is consistent with a number of hypotheses that aren't great, such as:

Authors are fudging results to get p = 0.05

Journals are much more likely to accept a paper with p = 0.05 than if p = 0.055

Authors are much less likely to submit results if p = 0.055 than if p = 0.05

Head et al. (2015) surveys the evidence across a number of fields.

Andrew Gelman and others have been campaigning to get rid of the idea of statistical significance and p-value thresholds for years; see this paper for a short summary of why. Not only would this reduce the incentive for authors to cheat on p values, there are other reasons to not want a bright-line rule to determine if something is "significant" or not.

Drug charges

The top two graphs in this set of four show histograms of the amount of cocaine people were charged with possessing before and after the passing of the Fair Sentencing Act in 2010, which raised the amount of cocaine necessary to trigger the 10-year mandatory minimum prison sentence for possession from 50g to 280g. There's a relatively smooth distribution before 2010 and a sharp discontinuity after 2010.

The bottom-left graph shows the sharp spike in prosecutions at 280 grams followed by what might be a drop in 2013 after evidentiary standards were changed[1].

Birth month and sports

These are scatterplots of football (soccer) players in the UEFA Youth League. The x-axis on both of these plots is how old players are modulo the year, i.e., their birth month normalized from 0 to 1.

The graph on the left is a histogram, which shows that there is a very strong relationship between where a person's birth falls within the year and their odds of making a club at the UEFA Youth League (U19) level. The graph on the right purports to show that birth time is only weakly correlated with actual value provided on the field.
The authors use playing time as a proxy for value, presumably because it's easy to measure. That's not a great measure, but the result they find (younger-within-the-year players have higher value, conditional on making the U19 league) is consistent with other studies on sports and discrimination, which find (for example) that black baseball players were significantly better than white baseball players for decades after desegregation in baseball, and that French-Canadian defensemen are also better than average (French-Canadians are stereotypically afraid to fight, don't work hard enough, and are too focused on offense).

The discontinuity isn't directly shown in the graphs above because the graphs only show birth date for one year. If we were to plot birth date by cohort across multiple years, we'd expect to see a sawtooth pattern in the probability that a player makes it into the UEFA youth league with a 10x difference between someone born one day before vs. after the threshold.

This phenomenon, that birth day or month is a good predictor of participation in higher-level youth sports as well as pro sports, has been studied across a variety of sports.

It's generally believed that this is caused by a discontinuity in youth sports:

Kids are bucketed into groups by age in years and compete against people in the same year

Within a given year, older kids are stronger, faster, etc., and perform better

This causes older-within-year kids to outcompete younger kids, which later results in older-within-year kids having higher levels of participation for a variety of reasons

This is arguably a "bug" in how youth sports works. But as we've seen in baseball as well as a survey of multiple sports, obviously bad decision making that costs individual teams tens or even hundreds of millions of dollars can persist for decades in the face of people publicly discussing how bad the decisions are.
In this case, the youth sports teams aren't feeder teams to pro teams, so they don't have a financial incentive to select players who are skilled for their age (as opposed to just taller and faster because they're slightly older), so this system-wide non-optimal behavior is even more difficult to fix than pro sports teams making locally non-optimal decisions that are completely under their control.

High school exit exam scores

This is a histogram of high school exit exam scores from the Polish language exam. We can see that a curiously high number of students score 30 or just above thirty while a curiously low number of students score from 23-29. This is from 2013; other years I've looked at (2010-2012) show a similar discontinuity.

Math exit exam scores don't exhibit any unusual discontinuities in the years I've examined (2010-2013).

An anonymous reddit commenter explains this:

When a teacher is grading matura (final HS exam), he/she doesn't know whose test it is. The only things that are known are: the number (code) of the student and the district which matura comes from (it is usually from completely different part of Poland). The system is made to prevent any kind of manipulation, for example from time to time teachers supervisor will come to check if test are graded correctly. I don't wanna talk much about system flaws (and advantages), it is well known in every education system in the world where final tests are made, but you have to keep in mind that there is a key, which teachers follow very strictly when grading.

So, when a score of the test is below 30%, exam is failed. However, before making final statement in protocol, a commision of 3 (I don't remember exact number) is checking test again. This is the moment, where difference between humanities and math is shown: teachers often try to find a one (or a few) missing points, so the test won't be failed, because it's a tragedy to this person, his school and somewhat fuss for the grading team.
Finding a "missing" point is not that hard when you are grading writing or open questions, which is a case in polish language, but nearly impossible in math. So that's the reason why distribution of scores is so different.

As with p values, having a bright-line threshold causes curious behavior. In this case, scoring below 30 on any subject (a 30 or above is required in every subject) and failing the exam has arbitrary negative effects for people, so teachers usually try to prevent people from failing if there's an easy way to do it, but a deeper root of the problem is the idea that it's necessary to produce a certification that's the discretization of a continuous score.

Procurement auctions

Kawai et al. looked at Japanese government procurement, in order to find suspicious patterns of bids like the ones described in Porter et al. (1993), which looked at collusion in procurement auctions on Long Island (in New York in the United States). One example that's given is:

In February 1983, the New York State Department of Transportation (DoT) held a procurement auction for resurfacing 0.8 miles of road. The lowest bid in the auction was $4 million, and the DoT decided not to award the contract because the bid was deemed too high relative to its own cost estimates. The project was put up for a reauction in May 1983 in which all the bidders from the initial auction participated. The lowest bid in the reauction was 20% higher than in the initial auction, submitted by the previous low bidder. Again, the contract was not awarded. The DoT held a third auction in February 1984, with the same set of bidders as in the initial auction. The lowest bid in the third auction was 10% higher than the second time, again submitted by the same bidder. The DoT apparently thought this was suspicious: It is notable that the same firm submitted the low bid in each of the auctions. Because of the unusual bidding patterns, the contract was not awarded through 1987.
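
To make the bid sequence in the quoted example concrete, here's a small Python sketch of the arithmetic. It assumes the $4 million figure is exact and that each percentage is relative to the previous auction's low bid, as the quote describes; the function name low_bids is made up for illustration.

```python
def low_bids(initial, increases):
    """Return the low bid in each auction, given the initial low bid and
    the fractional increase of each later auction over the one before it."""
    bids = [initial]
    for pct in increases:
        # Each increase is applied to the previous auction's low bid.
        bids.append(bids[-1] * (1 + pct))
    return bids

# Figures from the quoted example: $4 million, then +20%, then +10%.
bids = low_bids(4_000_000, [0.20, 0.10])
for auction, bid in zip(["Feb 1983", "May 1983", "Feb 1984"], bids):
    print(f"{auction}: ${bid:,.0f}")
```

Under those assumptions the low bids come out to $4,000,000, $4,800,000, and $5,280,000, so the third low bid is 32% above a first bid that the DoT had already rejected as too high.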
It could be argued that this is expected because different firms have different cost structures, so the lowest bidder in an auction for one particular project should be expected to be the lowest bidder in subsequent auctions for the same project. In order to distinguish between collusion and real structural cost differences between firms, Kawai et al. (2015) looked at auctions where the difference in bid between the first and second place firms was very small, making the winner effectively random.

In the auction structure studied, bidders submit a secret bid. If the secret bid is above a secret minimum, then the lowest bidder wins the auction and gets the contract. If not, the lowest bid is revealed to all bidders and another round of bidding is done. Kawai et al. found that, in about 97% of auctions, the bidder who submitted the lowest bid in the first round also submitted the lowest bid in the second round (the probability that the second lowest bidder remains second lowest was 26%).

Below is a histogram of the difference in first and second round bids between the first-lowest and second-lowest bidders (left column) and the second-lowest and third-lowest bidders (right column). Each row has a different filtering criterion for how close the auction has to be in order to be included. In the top row, all auctions that reached the third round were included; in the second and third rows, the normalized delta between the first and second bidders was less than 0.05 and 0.01, respectively; in the last row, the normalized delta between the first and the third bidder was less than 0.03. All numbers are normalized because the absolute size of auctions can vary.

We can see that the distributions of deltas between the first and second round are roughly symmetrical when comparing second and third lowest bidders.
But when comparing first and second lowest bidders, there's a sharp discontinuity at zero, indicating that the second-lowest bidder almost never lowers their bid by more than the first-lowest bidder did. If you read the paper, you can see that the same structure persists into auctions that go into a third round.

I don't mean to pick on Japanese procurement auctions in particular. There's an extensive literature on procurement auctions that's found collusion in many cases, often much more blatant than the case presented above (e.g., there are a few firms and they round-robin who wins across auctions, or there are a handful of firms and every firm except for the winner puts in the same losing bid).

Restaurant inspection scores

The histograms below show a sharp discontinuity between 13 and 14, which is the difference between an A grade and a B grade. It appears that some regions also have a discontinuity between 27 and 28, which is the difference between a B and a C, and this older analysis from 2014 found what appears to be a similar discontinuity between B and C grades.

Inspectors have discretion in what violations are tallied and it appears that there are cases where restaurants are nudged up to the next higher grade.

Marathon finishing times

A histogram of marathon finishing times (finish times on the x-axis, count on the y-axis) across 9,789,093 finishes shows noticeable discontinuities at every half hour, as well as at "round" times like :10, :15, and :20.

An analysis of times within each race (see section 4.4, figures 7-9) indicates that this is at least partially because people speed up (or slow down less than usual) towards the end of races if they're close to a "round" time2.

Notes

This post doesn't really have a goal or a point, it's just a collection of discontinuities that I find fun.

One thing that's maybe worth noting is that I've gotten a lot of mileage in my career both out of being suspicious of discontinuities and figuring out where they come from and also out
of applying standard techniques to smooth out discontinuities.For finding discontinuities, basic tools like "drawing a scatterplot", "drawing a histogram", "drawing the CDF" often come in handy. Other kinds of visualizations that add temporality, like flamescope, can also come in handy.We noted above that queues create a kind of discontinuity that, in some circumstances, should be smoothed out. We also noted that we see similar behavior for other kinds of thresholds and that randomization can be a useful tool to smooth out discontinuities in thresholds as well. Randomization can also be used to allow for reducing quantization error when reducing precision with ML and in other applications.Thanks to Leah Hanson, Omar Rizwan, Dmitry Belenko, Kamal Marhubi, Danny Vilea, Nick Roberts, Lifan Zeng, Wesley Aptekar-Cassels, Thomas Hauk, @BaudDev, and Michael Sullivan for comments/corrections/discussion.Also, please feel free to send me other interesting discontinuities!Most online commentary I've seen about this paper is incorrect. I've seen this paper used as evidence of police malfeasance because the amount of cocaine seized jumped to 280g. This is the opposite of what's described in the paper, where the author notes that, based on drug seizure records, amounts seized do not appear to be the cause of this change. After noting that drug seizures are not the cause, the author notes that prosecutors can charge people for amounts that are not the same as the amount seized and then notes:I do find bunching at 280g after 2010 in case management data from the Executive Office of the US Attorney (EOUSA). I also find that approximately 30% of prosecutors are responsible for the rise in cases with 280g after 2010, and that there is variation in prosecutor-level bunching both within and between districts. 
Prosecutors who bunch cases at 280g also have a high share of cases right above 28g after 2010 (the 5-year threshold post-2010) and a high share of cases above 50g prior to 2010 (the 10-year threshold pre-2010). Also, bunching above a mandatory minimum threshold persists across districts for prosecutors who switch districts. Moreover, when a bunching prosecutor switches into a new district, all other attorneys in that district increase their own bunching at mandatory minimums. These results suggest that the observed bunching at sentencing is specifically due to prosecutorial discretion.

This is mentioned in the abstract and then expounded on in the introduction (the quoted passage is from the introduction), so I think that most people commenting on this paper can't have read it. I've done a few surveys of comments on papers on blog posts and I generally find that, in cases where it's possible to identify this (e.g., when the post is mistitled), the vast majority of commenters can't have read the paper or post they're commenting on, but that's a topic for another post.

There is some evidence that something fishy may be going on in seizures (e.g., see Fig. A8.(c)), but if the analysis in the paper is correct, the impact of that is much smaller than the impact of prosecutorial discretion.

One of the most common comments I've seen online about this graph and/or this paper is that this is due to pace runners provided by the marathon. Section 4.4 of the paper gives multiple explanations for why this cannot be the case, once again indicating that people tend to comment without reading the paper.
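As a rough illustration of the note above that randomization can reduce quantization error when reducing precision, here is a minimal Python sketch of stochastic rounding. The function names (`round_nearest`, `round_stochastic`, `mean_after_rounding`), the step size, and the input data are all made up for the example; this is one common way to apply the idea, not something taken from the post.

```python
import math
import random


def round_nearest(x: float, step: float) -> float:
    """Deterministically round x to the nearest multiple of step."""
    return step * math.floor(x / step + 0.5)


def round_stochastic(x: float, step: float, rng: random.Random) -> float:
    """Round x to one of its two neighbouring multiples of step.

    The upper neighbour is chosen with probability equal to how far x sits
    between the two neighbours, so the expected result equals x.
    """
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower  # position of x between the neighbours, in [0, 1)
    return step * (lower + (1 if rng.random() < frac else 0))


def mean_after_rounding(values, rounder):
    """Average of the values after applying the given rounding function."""
    return sum(rounder(v) for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the run is repeatable
values = [0.3] * 10_000  # hypothetical data: many identical small values

# Deterministic rounding sends every 0.3 to 0.0, so the average is biased.
nearest_mean = mean_after_rounding(values, lambda v: round_nearest(v, 1.0))

# Stochastic rounding sends roughly 30% of the values to 1.0, so the
# average stays close to 0.3 even though each value is stored as 0 or 1.
stochastic_mean = mean_after_rounding(
    values, lambda v: round_stochastic(v, 1.0, rng)
)

print(nearest_mean)
print(stochastic_mean)
```

Each individual rounded value is no more accurate than before, but the sharp threshold at 0.5 no longer produces a systematic bias in aggregates, which is the same smoothing effect described for other thresholds.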
Reaching 95%-ile isn't very impressive because it's not that hard to do. I think this is one of my most ridiculable ideas. It doesn't help that, when stated nakedly, that sounds elitist. But I think it's just the opposite: most people can become (relatively) good at most things.Note that when I say 95%-ile, I mean 95%-ile among people who participate, not all people (for many activities, just doing it at all makes you 99%-ile or above across all people). I'm also not referring to 95%-ile among people who practice regularly. The "one weird trick" is that, for a lot of activities, being something like 10%-ile among people who practice can make you something like 90%-ile or 99%-ile among people who participate.This post is going to refer to specifics since the discussions I've seen about this are all in the abstract, which turns them into Rorschach tests. For example, Scott Adams has a widely cited post claiming that it's better to be a generalist than a specialist because, to become "extraordinary", you have to either be "the best" at one thing or 75%-ile at two things. If that were strictly true, it would surely be better to be a generalist, but that's of course exaggeration and it's possible to get a lot of value out of a specialized skill without being "the best"; since the precise claim, as written, is obviously silly and the rest of the post is vague handwaving, discussions will inevitably devolve into people stating their prior beliefs and basically ignoring the content of the post.Personally, in every activity I've participated in where it's possible to get a rough percentile ranking, people who are 95%-ile constantly make mistakes that seem like they should be easy to observe and correct. 
"Real world" activities typically can't be reduced to a percentile rating, but achieving what appears to be a similar level of proficiency seems similarly easy.We'll start by looking at Overwatch (a video game) in detail because it's an activity I'm familiar with where it's easy to get ranking information and observe what's happening, and then we'll look at some "real world" examples where we can observe the same phenomena, although we won't be able to get ranking information for real world examples1.OverwatchAt 90%-ile and 95%-ile ranks in Overwatch, the vast majority of players will pretty much constantly make basic game losing mistakes. These are simple mistakes like standing next to the objective instead of on top of the objective while the match timer runs out, turning a probable victory into a certain defeat. See the attached footnote if you want enough detail about specific mistakes that you can decide for yourself if a mistake is "basic" or not2.Some reasons we might expect this to happen are:People don't want to win or don't care about winningPeople understand their mistakes but haven't put in enough time to fix themPeople are untalentedPeople don't understand how to spot their mistakes and fix themIn Overwatch, you may see a lot of (1), people who don t seem to care about winning, at lower ranks, but by the time you get to 30%-ile, it's common to see people indicate their desire to win in various ways, such as yelling at players who are perceived as uncaring about victory or unskilled, complaining about people who they perceive to make mistakes that prevented their team from winning, etc.3. 
Other than the occasional troll, it's not unreasonable to think that people are generally trying to win when they're severely angered by losing.(2), not having put in time enough to fix their mistakes will, at some point, apply to all players who are improving, but if you look at the median time played at 50%-ile, people who are stably ranked there have put in hundreds of hours (and the median time played at higher ranks is higher). Given how simple the mistakes we're discussing are, not having put in enough time cannot be the case for most players.A common complaint among low-ranked Overwatch players in Overwatch forums is that they're just not talented and can never get better. Most people probably don't have the talent to play in a professional league regardless of their practice regimen, but when you can get to 95%-ile by fixing mistakes like "not realizing that you should stand on the objective", you don't really need a lot of talent to get to 95%-ile.While (4), people not understanding how to spot and fix their mistakes, isn't the only other possible explanation4, I believe it's the most likely explanation for most players. Most players who express frustration that they're stuck at a rank up to maybe 95%-ile or 99%-ile don't seem to realize that they could drastically improve by observing their own gameplay or having someone else look at their gameplay.One thing that's curious about this is that Overwatch makes it easy to spot basic mistakes (compared to most other activities). After you're killed, the game shows you how you died from the viewpoint of the player who killed you, allowing you to see what led to your death. Overwatch also records the entire game and lets you watch a replay of the game, allowing you to figure out what happened and why the game was won or lost. 
In many other games, you'd have to set up recording software to be able to view a replay.

If you read Overwatch forums, you'll see a regular stream of posts that are basically "I'm SOOOOOO FRUSTRATED! I've played this game for 1200 hours and I'm still ranked 10%-ile, [some Overwatch specific stuff that will vary from player to player]". Another user will inevitably respond with something like "we can't tell what's wrong from your text, please post a video of your gameplay". In the cases where the original poster responds with a recording of their play, people will post helpful feedback that will immediately make the player much better if they take it seriously. If you follow these people who ask for help, you'll often see them ask for feedback at a much higher rank (e.g., moving from 10%-ile to 40%-ile) shortly afterwards. It's nice to see that the advice works, but it's unfortunate that so many players don't realize that watching their own recordings or posting recordings for feedback could have saved 1198 hours of frustration.

It appears to be common for Overwatch players (well into 95%-ile and above) to:

- Want to improve
- Not get feedback
- Improve slowly when getting feedback would make improving quickly easy

Overwatch provides the tools to make it relatively easy to get feedback, but people who very strongly express a desire to improve don't avail themselves of these tools.

Real life

My experience is that other games are similar and I think that "real life" activities aren't so different, although there are some complications.

One complication is that real life activities tend not to have a single, one-dimensional, objective to optimize for.
Another is that what makes someone good at a real life activity tends to be poorly understood (by comparison to games and sports) even in relation to a specific, well defined, goal.Games with rating systems are easy to optimize for: your meta-goal can be to get a high rating, which can typically be achieved by increasing your win rate by fixing the kinds of mistakes described above, like not realizing that you should step onto the objective. For any particular mistake, you can even make a reasonable guess at the impact on your win rate and therefore the impact on your rating.In real life, if you want to be (for example) "a good speaker", that might mean that you want to give informative talks that help people learn or that you want to give entertaining talks that people enjoy or that you want to give keynotes at prestigious conferences or that you want to be asked to give talks for $50k an appearance. Those are all different objectives, with different strategies for achieving them and for some particular mistake (e.g., spending 8 minutes on introducing yourself during a 20 minute talk), it's unclear what that means with respect to your goal.Another thing that makes games, at least mainstream ones, easy to optimize for is that they tend to have a lot of aficionados who have obsessively tried to figure out what's effective. This means that if you want to improve, unless you're trying to be among the top in the world, you can simply figure out what resources have worked for other people, pick one up, read/watch it, and then practice. 
For example, if you want to be 99%-ile in a trick-taking card game like bridge or spades (among all players, not subgroups like "ACBL players with masterpoints" or "people who regularly attend North American Bridge Championships"), you can do this by:

- learning the basics of the game
- reading a beginner book on cardplay
- practicing applying the material

If you want to become a good speaker and you have a specific definition of a good speaker in mind, there still isn't an obvious path forward. Great speakers will give directly contradictory advice (e.g., avoid focusing on presentation skills vs. practice presentation skills). Relatively few people obsessively try to improve and figure out what works, which results in a lack of rigorous curricula for improving. However, this also means that it's easy to improve in percentile terms since relatively few people are trying to improve at all.

Despite all of the caveats above, my belief is that it's easier to become relatively good at real life activities than at games or sports because there's so little deliberate practice put into most real life activities. Just for example, if you're a local table tennis hotshot who can beat every rando at a local bar, when you challenge someone to a game and they say "sure, what's your rating?" you know you're in for a shellacking by someone who can probably beat you while playing with a shoe brush. You're probably 99%-ile, but someone with no talent who's put in the time to practice the basics is going to have a serve that you can't return as well as be able to kill any shot a local bar expert is able to consistently hit. In most real life activities, there's almost no one who puts in a level of deliberate practice equivalent to someone who goes down to their local table tennis club and practices two hours a week, let alone someone like a top pro, who might seriously train for four hours a day.

To give a couple of concrete examples, I helped Leah prepare for talks from 2013 to 2017.
The first couple practice talks she gave were about the same as you'd expect if you walked into a random talk at a large tech conference. For the first couple years she was speaking, she did something like 30 or so practice runs for each public talk, of which I watched and gave feedback on half. Her first public talk was (IMO) well above average for a talk at a large, well regarded, tech conference and her talks got better from there until she stopped speaking in 2017.

As we discussed above, this is more subjective than game ratings and there's no way to really determine a percentile, but if you look at how most people prepare for talks, it's not too surprising that Leah was above average. At one of the first conferences she spoke at, the night before the conference, we talked to another speaker who mentioned that they hadn't finished their talk yet and only had fifteen minutes of material (for a forty minute talk). They were trying to figure out how to fill the rest of the time. That kind of preparation isn't unusual and the vast majority of talks prepared like that aren't great.

Most people consider doing 30 practice runs for a talk to be absurd, a totally obsessive amount of practice, but I think Gary Bernhardt has it right when he says that, if you're giving a 30-minute talk to a 300 person audience, that's 150 person-hours watching your talk, so it's not obviously unreasonable to spend 15 hours practicing (and 30 practice runs will probably be less than 15 hours since you can cut a number of the runs short and/or repeatedly practice problem sections). One thing to note is that this level of practice, considered obsessive when giving a talk, still pales in comparison to the amount of time a middling table tennis club player will spend practicing.

If you've studied pedagogy, you might say that the help I gave to Leah was incredibly lame.
It's known that having laypeople try to figure out how to improve among themselves is among the worst possible ways to learn something, good instruction is more effective and having a skilled coach or teacher give one-on-one instruction is more effective still5. That's 100% true, my help was incredibly lame. However, most people aren't going to practice a talk more than a couple times and many won't even practice a single time (I don't have great data proving this, this is from informally polling speakers at conferences I've attended). This makes Leah's 30 practice runs an extraordinary amount of practice compared to most speakers, which resulted in a relatively good outcome even though we were using one of the worst possible techniques for improvement.My writing is another example. I'm not going to compare myself to anyone else, but my writing improved dramatically the first couple of years I wrote this blog just because I spent a little bit of effort on getting and taking feedback.Leah read one or two drafts of almost every post and gave me feedback. On the first posts, since neither one of us knew anything about writing, we had a hard time identifying what was wrong. If I had some awkward prose or confusing narrative structure, we'd be able to point at it and say "that looks wrong" without being able to describe what was wrong or suggest a fix. It was like, in the era before spellcheck, when you misspelled a word and could tell that something was wrong, but every permutation you came up with was just as wrong.My fix for that was to hire a professional editor whose writing I respected with the instructions "I don't care about spelling and grammar fixes, there are fundamental problems with my writing that I don't understand, please explain to me what they are"6. I think this was more effective than my helping Leah with talks because we got someone who's basically a professional coach involved. 
An example of something my editor helped us with was giving us a vocabulary we could use to discuss structural problems, the way design patterns gave people a vocabulary to talk about OO design.

Back to this blog's regularly scheduled topic: programming

Programming is similar to the real life examples above in that it's impossible to assign a rating or calculate percentiles or anything like that, but it is still possible to make significant improvements relative to your former self without too much effort by getting feedback on what you're doing.

For example, here's one thing Michael Malis did:

One incredibly useful exercise I've found is to watch myself program. Throughout the week, I have a program running in the background that records my screen. At the end of the week, I'll watch a few segments from the previous week. Usually I will watch the times that felt like it took a lot longer to complete some task than it should have. While watching them, I'll pay attention to specifically where the time went and figure out what I could have done better. When I first did this, I was really surprised at where all of my time was going.

For example, previously when writing code, I would write all my code for a new feature up front and then test all of the code collectively. When testing code this way, I would have to isolate which function the bug was in and then debug that individual function. After watching a recording of myself writing code, I realized I was spending about a quarter of the total time implementing the feature tracking down which functions the bugs were in! This was completely non-obvious to me and I wouldn't have found it out without recording myself. Now that I'm aware that I spent so much time isolating which function a bugs are in, I now test each function as I write it to make sure they work.
This allows me to write code a lot faster as it dramatically reduces the amount of time it takes to debug my code.In the past, I've spent time figuring out where time is going when I code and basically saw the same thing as in Overwatch, except instead of constantly making game-losing mistakes, I was constantly doing pointlessly time-losing things. Just getting rid of some of those bad habits has probably been at least a 2x productivity increase for me, pretty easy to measure since fixing these is basically just clawing back wasted time. For example, I noticed how I'd get distracted for N minutes if I read something on the internet when I needed to wait for two minutes, so I made sure to keep a queue of useful work to fill dead time (and if I was working on something very latency sensitive where I didn't want to task switch, I'd do nothing until I was done waiting).One thing to note here is that it's important to actually track what you're doing and not just guess at what you're doing. When I've recorded what people do and compare it to what they think they're doing, these are often quite different. It would generally be considered absurd to operate a complex software system without metrics or tracing, but it's normal to operate yourself without metrics or tracing, even though you're much more complex and harder to understand than the software you work on.Jonathan Tang has noted that choosing the right problem dominates execution speed. 
I don't disagree with that, but doubling execution speed is still a decent win that's independent of selecting the right problem to work on. I don't think that how to choose the right problem can be effectively described in the abstract, and the context necessary to give examples would be much longer than the already too long Overwatch examples in this post; maybe I'll write another post that's just about that.

Anyway, this is sort of an odd post for me to write since I think that culturally, we care a bit too much about productivity in the U.S., especially in places I've lived recently (NYC & SF). But at a personal level, higher productivity doing work or chores doesn't have to be converted into more work or chores, it can also be converted into more vacation time or more time doing whatever you value.

And for games like Overwatch, I don't think improving is a moral imperative; there's nothing wrong with having fun at 50%-ile or 10%-ile or any rank. But in every game I've played with a rating and/or league/tournament system, a lot of people get really upset and unhappy when they lose even when they haven't put much effort into improving.
If that's the case, why not put a little bit of effort into improving and spend a little bit less time being upset?

Some meta-techniques for improving

- Get feedback and practice
  - Ideally from an expert coach but, if not, this can be from a layperson or even yourself (if you have some way of recording/tracing what you're doing)
- Guided exercises or exercises with solutions
  - This is very easy to find in books for "old" games, like chess or bridge.
  - For particular areas, you can often find series of books that have these, e.g., in math, books in the Springer Undergraduate Mathematics Series (SUMS) tend to have problems with solutions

Appendix: other most ridiculable ideas

Here are the ideas I've posted about that were the most widely ridiculed at the time of the post:

- It's not uncommon for programmers at trendy tech companies to make $350k/yr or more (2015, stated number was $250k/yr at the time)
- Monorepos can be reasonable (2015)
- We should expect to see a lot more CPU bugs (2016)
- Markets are not incompatible with discrimination (2014)
- Computers are getting slower in some ways (2017)
- Empirical evidence on the benefit of types is almost non-existent (2014)
- It's reasonable to write technical posts on a subject that avoid domain-specific terminology

My posts on compensation have the dubious distinction of being the posts most frequently called out both for being so obvious that they're pointless as well as for being laughably wrong.
I suspect they're also the posts that have had the largest aggregate impact on people -- I've had a double digit number of people tell me one of the compensation posts changed their life and they now make $x00,000/yr more than they used to because they know it's possible to get a much higher paying job and I doubt that I even hear from 10% of the people who make a big change as a result of learning that it's possible to make a lot more money.When I wrote my first post on compensation, in 2015, I got ridiculed more for writing something obviously wrong than for writing something obvious, but the last few years things have flipped around. I still get the occasional bit of ridicule for being wrong when some corner of Twitter or a web forum that's well outside the HN/reddit bubble runs across my post, but the ratio of obviously wrong to obvious has probably gone from 20:1 to 1:5.Opinions on monorepos have also seen a similar change since 2015. Outside of some folks at big companies, monorepos used to be considered obviously stupid among people who keep up with trends, but this has really changed. Not as much as opinions on compensation, but enough that I'm now a little surprised when I meet a hardline anti-monorepo-er.Although it's taken longer for opinions to come around on CPU bugs, that's probably the post that now gets the least ridicule from the list above.That markets don't eliminate all discrimination is the one where opinions have come around the least. Hardline "all markets are efficient" folks aren't really convinced by academic work like Becker's The Economics of Discrimination or summaries like the evidence laid out in the post.The posts on computers having higher latency and the lack of empirical evidence of the benefit of types are the posts I've seen pointed to the most often to defend a ridiculable opinion. 
I didn't know what I'd find when I started doing the work for either post and they both happen to have turned up evidence that's the opposite of the most common loud claims (there's very good evidence that advanced type systems improve safety in practice and of course computers are faster in every way, people who think they're slower are just indulging in nostalgia). I don't know if this has changed many opinions. However, I haven't gotten much direct ridicule for either post even though both posts directly state a position I see commonly ridiculed online. I suspect that's partially because both posts are empirical, so there's not much to dispute (though the post on discrimination is also empirical and it still gets its share of ridicule).

The last idea in the list is more meta: no one directly tells me that I should use more obscure terminology. Instead, I get comments that I must not know much about X because I'm not using terms of art. Using terms of art is a common way to establish credibility or authority, but that's something I don't really believe in. Arguing from authority doesn't tell you anything; adding needless terminology just makes things more difficult for readers who aren't in the field and are reading because they're interested in the topic but don't want to actually get into the field.

This is a pretty fundamental disagreement that I have with a lot of people. Just for example, I recently got into a discussion with an authority who insisted that it wasn't possible for me to reasonably disagree with them (I suggested we agree to disagree) because they're an authority on the topic and I'm not. It happens that I worked on the formal verification of a system very similar to the system we were discussing, but I didn't mention that because I don't believe that my status as an authority on the topic matters.
If someone has such a weak argument that they have to fall back on an infallible authority, that's usually a sign that they don't have a well-reasoned defense of their position. This goes double when they point to themselves as the infallible authority.

I have about 20 other posts on stupid sounding ideas queued up in my head, but I mostly try to avoid writing things that are controversial, so I don't know that I'll write many of those up. If I were to write one post a month (much more frequently than my recent rate) and limit myself to 10% of posts on ridiculable ideas, it would take 16 years to write up all of the ridiculable ideas I currently have.

Thanks to Leah Hanson, Hillel Wayne, Robert Schuessler, Michael Malis, Kevin Burke, Jeremie Jost, Pierre-Yves Baccou, Veit Heller, Jeff Fowler, Malte Skarupe, David Turner, Akiva Leffert, Lifan Zeng, John Hergenroder, Wesley Aptekar-Cassels, Chris Lample, Julia Evans, Anja Boskovic, Vaibhav Sagar, Sean Talts, Valentin Hartmann, Sean Barrett, Kevin Shannon, Enzo Ferey, Yuri Vishnevsky, and an anonymous commenter for comments/corrections/discussion.

The choice of Overwatch is arbitrary among activities I'm familiar with where:

- I know enough about the activity to comment on it
- I've observed enough people trying to learn it that I can say if it's "easy" or not to fix some mistake or class of mistake
- There's a large enough set of rated players to support the argument
- Many readers will also be familiar with the activity

99% of my gaming background comes from 90s video games, but I'm not going to use those as examples because relatively few readers will be familiar with those games.
I could also use "modern" board games like Puerto Rico, Dominion, Terra Mystica, ASL etc., but the set of people who played in rated games is very small, which makes the argument less convincing (perhaps people who play in rated games are much worse than people who don't -- unlikely, but difficult to justify without comparing gameplay between rated and unrated games, which is pretty deep into the weeds for this post).

There are numerous activities that would be better to use than Overwatch, but I'm not familiar enough with them to use them as examples. For example, on reading a draft of this post, Kevin Burke noted that he's observed the same thing while coaching youth basketball and multiple readers noted that they've observed the same thing in chess, but I'm not familiar enough with youth basketball or chess to confidently say much about either activity, even though they'd be better examples because it's likely that more readers are familiar with basketball or chess than with Overwatch.

When I first started playing Overwatch (which is when I did that experiment), I ended up getting rated slightly above 50%-ile (for Overwatch players, that was in Plat -- this post is going to use percentiles and not ranks to avoid making non-Overwatch players have to learn what the ranks mean).
It's generally believed and probably true that people who play the main ranked game mode in Overwatch are, on average, better than people who only play unranked modes, so it's likely that my actual percentile was somewhat higher than 50%-ile and that all "true" percentiles listed in this post are higher than the nominal percentiles.

Some things you'll regularly see at slightly above 50%-ile are:

- Supports (healers) will heal someone who's at full health (which does nothing) while a teammate who's next to them is dying and then dies
- Players will not notice someone who walks directly behind the team and kills people one at a time until the entire team is killed
- Players will shoot an enemy until only one more shot is required to kill the enemy and then switch to a different target, letting the 1-health enemy heal back to full health before shooting at that enemy again
- After dying, players will not wait for their team to respawn and will, instead, run directly into the enemy team to fight them 1v6. This will repeat for the entire game (the game is designed to be 6v6, but in ranks below 95%-ile, it's rare to see a 6v6 engagement after one person on one team dies)
- Players will clearly have no idea what character abilities do, including for the character they're playing
- Players go for very high risk but low reward plays (for Overwatch players, a classic example of this is Rein going for a meme pin when the game opens on 2CP defense, very common at 50%-ile, rare at 95%-ile since players who think this move is a good idea tend to have generally poor decision making)
- People will have terrible aim and will miss four or five shots in a row when all they need to do is hit someone once to kill them
- If a single flanking enemy threatens a healer who can't escape plus a non-healer with an escape ability, the non-healer will probably use their ability to run away, leaving the healer to die, even though they could easily kill the flanker and save their healer if they just attacked while being healed

Having just one aspect of your gameplay be merely bad instead of atrocious is enough to get to 50%-ile. For me, that was my teamwork; for others, it's other parts of their gameplay. The reason I'd say that my teamwork was bad and not good or even mediocre was that I basically didn't know how to play the game and didn't know what any of the characters' strengths, weaknesses, and abilities are, so I couldn't possibly coordinate effectively with my team. I also didn't know how the game modes actually worked (e.g., under what circumstances the game will end in a tie vs. going into another round), so I was basically wandering around randomly with a preference towards staying near the largest group of teammates I could find. That's above average.

You could say that someone is pretty good at the game since they're above average.
But in a non-relative sense, being slightly above average is quite bad -- it's hard to argue that someone who doesn't notice their entire team being killed from behind while two teammates are yelling "[enemy] behind us!" over voice comms isn't bad.

After playing a bit more, I ended up with what looks like a "true" rank of about 90%-ile when I'm using a character I know how to use. Due to volatility in ranking as well as matchmaking, I played in games as high as 98%-ile. My aim and dodging were still atrocious. Relative to my rank, my aim was actually worse than when I was playing in 50%-ile games since my opponents were much better and I was only a little bit better. In 90%-ile, two copies of myself would probably lose fighting against most people 2v1 in the open. I would also usually lose a fight if the opponent was in the open and I was behind cover such that only 10% of my character was exposed, so my aim was arguably more than 10x worse than median at my rank.

My "trick" for getting to 90%-ile despite being a 1/10th aimer was learning how the game worked and playing in a way that maximized the probability of winning (to the best of my ability), as opposed to playing the game like it's an FFA game where your goal is to get kills as quickly as possible. It takes a bit more context to describe what this means in 90%-ile, so I'll only provide a couple examples, but these are representative of mistakes the vast majority of 90%-ile players are making all of the time (with the exception of a few players who have grossly defective aim, like myself, who make up for their poor aim by playing better than average for the rank in other ways).

Within the game, the goal of the game is to win. There are different game modes, but for the mainline ranked game, they all will involve some kind of objective that you have to be on or near.
It's very common to get into a situation where the round timer is ticking down to zero and your team is guaranteed to lose if no one on your team touches the objective but your team may win if someone can touch the objective and not die instantly (which will cause the game to go into overtime until shortly after both teams stop touching the objective). A concrete example of this that happens somewhat regularly is: the enemy team has four players on the objective while your team has two players near the objective, one tank and one support/healer. The other four players on your team died and are coming back from spawn. They're close enough that if you can touch the objective and not instantly die, they'll arrive and probably take the objective for the win, but they won't get there in time if you die immediately after taking the objective, in which case you'll lose.

If you're playing the support/healer at 90%-ile to 95%-ile, this game will almost always end as follows: the tank will move towards the objective, get shot, decide they don't want to take damage, and then back off from the objective. As a support, you have a small health pool and will die instantly if you touch the objective because the other team will shoot you. Since your team is guaranteed to lose if you don't move up to the objective, you're forced to do so to have any chance of winning. After you're killed, the tank will either move onto the objective and die or walk towards the objective but not get there before time runs out. Either way, you'll probably lose.

If the tank did their job and moved onto the objective before you died, you could heal the tank for long enough that the rest of your team will arrive and you'll probably win. The enemy team, if they were coordinated, could walk around or through the tank to kill you, but they won't do that (anyone who knows that doing so will win them the game and can aim well enough to successfully follow through can't help but end up in a higher rank).
And the hypothetical tank on your team who knows that it's their job to absorb damage for their support in that situation and not vice versa won't stay at 95%-ile very long because they'll win too many games and move up to a higher rank.

Another basic situation that the vast majority of 90%-ile to 95%-ile players will get wrong is when you're on offense, waiting for your team to respawn so you can attack as a group. Even at 90%-ile, maybe 1/4 to 1/3 of players won't do this and will just run directly at the enemy team, but enough players will realize that 1v6 isn't a good idea that you'll often see 5v6 or 6v6 fights instead of the constant 1v6 and 2v6 fights you see at 50%-ile. Anyway, while waiting for the team to respawn in order to get a 5v6, it's very likely one player who realizes that they shouldn't just run into the middle of the enemy team 1v6 will decide they should try to hit the enemy team with long-ranged attacks 1v6. People will do this instead of hiding in safety behind a wall even when the enemy has multiple snipers with instant-kill long range attacks. People will even do this against multiple snipers when they're playing a character that isn't a sniper and needs to hit the enemy 2-3 times to get a kill, making it overwhelmingly likely that they won't get a kill while taking a significant risk of dying themselves. For Overwatch players, people will also do this when they have full ult charge and the other team doesn't, turning a situation that should be to your advantage (your team has ults ready and the other team has used ults), into a neutral situation (both teams have ults) at best, and instantly losing the fight at worst.

If you ever read an Overwatch forum, whether that's one of the reddit forums or the official Blizzard forums, a common complaint is "why are my teammates so bad? I'm at [90%-ile to 95%-ile rank], but all my teammates are doing obviously stupid game-losing things all the time, like [an example above]".
The answer is, of course, that the person asking the question is also doing obviously stupid game-losing things all the time because anyone who doesn't constantly make major blunders wins too much to stay at 95%-ile. This also applies to me.

People will argue that players at this rank should be good because they're better than 95% of other players, which makes them relatively good. But non-relatively, it's hard to argue that someone who doesn't realize that you should step on the objective to probably win the game instead of not touching the objective for a sure loss is good. One of the most basic things about Overwatch is that it's an objective-based game, but the majority of players at 90%-ile to 95%-ile don't play that way.

For anyone who isn't well into the 99%-ile, reviewing recorded games will reveal game-losing mistakes all the time. For myself, usually ranked 90%-ile or so, watching a recorded game will reveal tens of game-losing mistakes in a close game (which is maybe 30% of losses, the other 70% are blowouts where there isn't a single simple mistake that decides the game).

It's generally not too hard to fix these since the mistakes are like the example above: simple enough that once you see that you're making the mistake, the fix is straightforward because the mistake is straightforward.

There are probably some people who just want to be angry at their teammates.
Due to how infrequently you get matched with the same players, it's hard to see this in the main rated game mode, but I think you can sometimes see this when Overwatch runs mini-rated modes.

Mini-rated modes have a much smaller playerbase than the main rated mode, which has two notable side effects: players with a much wider variety of skill levels will be thrown into the same game and you'll see the same players over and over again if you play multiple games.

Since you end up matched with the same players repeatedly, you'll see players make the same mistakes and cause themselves to lose in the same way and then have the same tantrum and blame their teammates in the same way game after game.

You'll also see tantrums and teammate blaming in the normal rated game mode, but when you see it, you generally can't tell if the person who's having a tantrum is just having a bad day or if it's some other one-off occurrence since, unless you're ranked very high or very low (where there's a smaller pool of closely rated players), you don't run into the same players all that frequently. But when you see a set of players in 15-20 games over the course of a few weeks and you see them lose the game for the same reason a double-digit number of times followed by the exact same tantrum, you might start to suspect that some fraction of those people really want to be angry and that the main thing they're getting out of playing the game is a source of anger. You might also wonder about this from how some people use social media, but that's a topic for another post.

For example, there will also be players who have some kind of disability that prevents them from improving, but at the levels we're talking about, 99%-ile or below, that will be relatively rare (certainly well under 50%, and I think it's not unreasonable to guess that it's well under 10% of people who choose to play the game).
IIRC, there's at least one player who's in the top 500 who's deaf (this is severely disadvantageous since sound cues give a lot of fine-grained positional information that cannot be obtained in any other way), at least one legally blind player who's 99%-ile, and multiple players with physical impairments that prevent them from having fine-grained control of a mouse, i.e., who are basically incapable of aiming, who are 99%-ile.

There are also other kinds of reasons people might not improve. For example, Kevin Burke has noted that when he coaches youth basketball, some children don't want to do drills that they think make them look foolish (e.g., avoiding learning to dribble with their off hand even during drills where everyone is dribbling poorly because they're using their off hand). When I spent a lot of time in a climbing gym with a world class coach who would regularly send a bunch of kids to nationals and some to worlds, I'd observe the same thing in his classes -- kids, even ones who are nationally or internationally competitive, would sometimes avoid doing things because they were afraid it would make them look foolish to their peers. The coach's solution in those cases was to deliberately make the kid look extremely foolish and tell them that it's better to look stupid now than at nationals.

Note that, here, a skilled coach is someone who is skilled at coaching, not necessarily someone who is skilled at the activity. People who are skilled at the activity but who haven't explicitly been taught how to teach or spent a lot of time working on teaching are generally poor coaches.

If you read the acknowledgements section of any of my posts, you can see that I get feedback from more than just two people on most posts (and I really appreciate the feedback), but I think that, by volume, well over 90% of the feedback I've gotten has come from Leah and a professional editor.
When I ask people at trendy big tech companies why algorithms quizzes are mandatory, the most common answer I get is something like "we have so much scale, we can't afford to have someone accidentally write an O(n^2) algorithm and bring the site down"1. One thing I find funny about this is, even though a decent fraction of the value I've provided for companies has been solving phone-screen level algorithms problems on the job, I can't pass algorithms interviews! When I say that, people often think I mean that I fail half my interviews or something. It's more than half.

When I wrote a draft blog post of my interview experiences, draft readers panned it as too boring and repetitive because I'd failed too many interviews. I should summarize my failures as a table because no one's going to want to read a 10k word blog post that's just a series of failures, they said (which is good advice; I'm working on a version with a table). I've done maybe 40-ish "real" software interviews and passed maybe one or two of them (arguably zero)2.

Let's look at a few examples to make it clear what I mean by "phone-screen level algorithms problem", above.

At one big company I worked for, a team wrote a core library that implemented a resizable array for its own purposes. On each resize that overflowed the array's backing store, the implementation added a constant number of elements and then copied the old array to the newly allocated, slightly larger, array. This is a classic example of how not to implement a resizable array since it results in linear time resizing instead of amortized constant time resizing.
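To make the cost difference concrete, here is a minimal sketch that counts how many element copies each growth policy performs over a run of appends. It is not the library's code: the function names, the increment of 16, and the doubling policy are all assumptions chosen for illustration.

```python
def copies_for_appends(n, grow):
    """Count element copies made while appending n items to an array
    whose backing store is regrown with grow(capacity) whenever it is full."""
    capacity = 0
    size = 0
    copies = 0
    for _ in range(n):
        if size == capacity:
            capacity = grow(capacity)
            copies += size  # every existing element moves to the new store
        size += 1
    return copies


def constant_growth(capacity, increment=16):
    # Hypothetical policy resembling the one described: add a fixed
    # number of slots on every resize.
    return capacity + increment


def geometric_growth(capacity):
    # Hypothetical alternative: double the capacity on every resize.
    return max(1, capacity * 2)


n = 1 << 16
constant_copies = copies_for_appends(n, constant_growth)
geometric_copies = copies_for_appends(n, geometric_growth)
print(f"constant increment: {constant_copies} copies for {n} appends")
print(f"doubling:           {geometric_copies} copies for {n} appends")
```

With a fixed increment, the total number of copies grows roughly with the square of the number of appends, while doubling keeps the total below the number of appends, which is what "amortized constant time" means here.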
It's such a classic example that it's often used as the canonical example when demonstrating amortized analysis.

For people who aren't used to big tech company phone screens, typical phone screens that I've received are one of:

- an "easy" coding/algorithms question, maybe with a "very easy" warm-up question in front
- a series of "very easy" coding/algorithms questions
- a bunch of trivia (rare for generalist roles, but not uncommon for low-level or performance-related roles)

This array implementation problem is considered to be so easy that it falls into the "very easy" category and is either a warm-up for the "real" phone screen question or is bundled up with a bunch of similarly easy questions. And yet, this resizable array was responsible for roughly 1% of all GC pressure across all JVM code at the company (it was the second largest source of allocations across all code) as well as a significant fraction of CPU. Luckily, the resizable array implementation wasn't used as a generic resizable array and it was only instantiated by a semi-special-purpose wrapper, which is what allowed this to "only" be responsible for 1% of all GC pressure at the company. If asked as an interview question, it's overwhelmingly likely that most members of the team would've implemented this correctly in an interview. My fixing this made my employer more money annually than I've made in my life.

That was the second largest source of allocations; the largest source was converting a pair of long values to byte arrays in the same core library. It appears that this was done because someone wrote or copy-pasted a hash function that took a byte array as input, then modified it to take two inputs by taking two byte arrays and operating on them in sequence, which left the hash function interface as (byte[], byte[]). In order to call this function on two longs, they used a handy long to byte[] conversion function in a widely used utility library.
That function, in addition to allocating a byte[] and stuffing a long into it, also reverses the endianness of the long (the function appears to have been intended to convert long values to network byte order).

Unfortunately, switching to a more appropriate hash function would've been a major change, so my fix for this was to change the hash function interface to take a pair of longs instead of a pair of byte arrays and have the hash function do the endianness reversal instead of doing it as a separate step (since the hash function was already shuffling bytes around, this didn't create additional work). Removing these unnecessary allocations made my employer more money annually than I've made in my life.

Finding a constant factor speedup isn't technically an algorithms question, but it's also something you see in algorithms interviews. As a follow-up to an algorithms question, I commonly get asked "can you make this faster?" The answer to these often involves doing a simple optimization that will result in a constant factor improvement.

A concrete example that I've been asked twice in interviews is: you're storing IDs as ints, but you already have some context in the question that lets you know that the IDs are densely packed, so you can store them as a bitfield instead. The difference between the bitfield interview question and the real-world superfluous array is that the real-world existing solution is so far afield from the expected answer that you probably wouldn't be asked to find a constant factor speedup. More likely, you would've failed the interview at that point.

To pick an example from another company, the configuration for BitFunnel, a search index used in Bing, is another example of an interview-level algorithms question3.

The full context necessary to describe the solution is a bit much for this blog post, but basically, there's a set of bloom filters that needs to be configured.
One way to do this (which I'm told was being done) is to write a black-box optimization function that uses gradient descent to try to find an optimal solution. I'm told this always resulted in some strange properties and the output configuration always resulted in non-idealities which were worked around by making the backing bloom filters less dense, i.e., throwing more resources (and therefore money) at the problem.

To create a more optimized solution, you can observe that the fundamental operation in BitFunnel is equivalent to multiplying probabilities together, so, for any particular configuration, you can just multiply some probabilities together to determine how a configuration will perform. Since the configuration space isn't all that large, you can then put this inside a few for loops and iterate over the space of possible configurations and then pick out the best set of configurations. This isn't quite right because multiplying probabilities assumes a kind of independence that doesn't hold in reality, but that seems to work ok for the same reason that naive Bayesian spam filtering worked pretty well when it was introduced even though it incorrectly assumes the probability of any two words appearing in an email are independent. And if you want the full solution, you can work out the non-independent details, although that's probably beyond the scope of an interview.

Those are just three examples that came to mind; I run into this kind of thing all the time and could come up with tens of examples off the top of my head, perhaps more than a hundred if I sat down and tried to list every example I've worked on, certainly more than a hundred if I list examples I know of that someone else (or no one) has worked on.
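The "multiply probabilities inside a few for loops" idea from the BitFunnel example can be sketched with a toy model. This is not BitFunnel's actual configuration problem: the candidate densities, the row limit, the quality target, and the cost model below are all hypothetical, chosen only to show the shape of an exhaustive search under an independence assumption.

```python
from itertools import combinations_with_replacement
from math import prod

# All of these values are hypothetical, for illustration only.
DENSITIES = (0.05, 0.1, 0.2, 0.5)   # candidate bit densities for a row
MAX_ROWS = 4                        # most rows a configuration may use
TARGET_FALSE_POSITIVE_RATE = 0.015  # quality a configuration must reach


def false_positive_rate(config):
    # Independence assumption: a false positive needs a set bit in every
    # row, so the per-row probabilities are simply multiplied together.
    return prod(config)


def relative_cost(config):
    # Toy cost model: a row kept at density d needs about 1/d bits of
    # storage per set bit, so sparser rows are more expensive.
    return sum(1.0 / d for d in config)


def best_configuration():
    # The configuration space is small, so enumerate all of it.
    feasible = []
    for rows in range(1, MAX_ROWS + 1):
        for config in combinations_with_replacement(DENSITIES, rows):
            if false_positive_rate(config) <= TARGET_FALSE_POSITIVE_RATE:
                feasible.append(config)
    return min(feasible, key=relative_cost)


best = best_configuration()
print(best, false_positive_rate(best), relative_cost(best))
```

In this toy model the search covers only a few dozen candidates, so there is no need for a black-box optimizer; every configuration can be scored directly and the cheapest acceptable one picked.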
Both the examples in this post as well as the ones I haven't included have these properties:

- The example could be phrased as an interview question
- If phrased as an interview question, you'd expect most (and probably all) people on the relevant team to get the right answer in the timeframe of an interview
- The cost savings from fixing the example is worth more annually than my lifetime earnings to date
- The example persisted for long enough that it's reasonable to assume that it wouldn't have been discovered otherwise

At the start of this post, we noted that people at big tech companies commonly claim that they have to do algorithms interviews since it's so costly to have inefficiencies at scale. My experience is that these examples are legion at every company I've worked for that does algorithms interviews. Trying to get people to solve algorithms problems on the job by asking algorithms questions in interviews doesn't work.

One reason is that even though big companies try to make sure that the people they hire can solve algorithms puzzles, they also incentivize many or most developers to avoid deploying that kind of reasoning to make money.

Of the three solutions for the examples above, two are in production and one isn't. That's about my normal hit rate if I go to a random team with a diff and don't persistently follow up (as opposed to a team that I have reason to believe will be receptive, or a team that's asked for help, or if I keep pestering a team until the fix gets taken).

If you're very cynical, you could argue that it's surprising the success rate is that high. If I go to a random team, it's overwhelmingly likely that efficiency is in neither the team's objectives nor their org's objectives. The company is likely to have spent a decent amount of effort incentivizing teams to hit their objectives -- what's the point of having objectives otherwise?
Accepting my diff will require them to test, integrate, and deploy the change and will create risk (because all deployments have non-zero risk). Basically, I'm asking teams to do some work and take on some risk to do something that's worthless to them. Despite incentives, people will usually take the diff, but they're not very likely to spend a lot of their own spare time trying to find efficiency improvements (and their normal work time will be spent on things that are aligned with the team's objectives)4.

Hypothetically, let's say a company didn't try to ensure that its developers could pass algorithms quizzes but did incentivize developers to use relatively efficient algorithms. I don't think any of the three examples above could have survived, undiscovered, for years nor could they have remained unfixed. Some hypothetical developer working at a company where people profile their code would likely have looked at the hottest items in the profile for the most computationally intensive library at the company. The "trick" for both of the first two examples isn't any kind of algorithms wizardry, it's just looking at all, which is something incentives can fix. The third example is less inevitable since there isn't a standard tool that will tell you to look at the problem. It would also be easy to try to spin the result as some kind of wizardry -- that example formed the core part of a paper that won "best paper award" at the top conference in its field (IR), but the reality is that the "trick" was applying high school math, which means the real trick was having enough time to look at places where high school math might be applicable to find one.

I actually worked at a company that used the strategy of "don't ask algorithms questions in interviews, but do incentivize things that are globally good for the company".
During my time there, I found only a single fix that nearly meets the criteria for the examples above (if the company had more scale, it would've met all of the criteria, but due to the company's size, increases in efficiency were worth much less than at big companies -- much more than I was making at the time, but the annual return was still less than my total lifetime earnings to date).

I think the main reason that I only found one near-example is that enough people viewed making the company better as their job, so straightforward high-value fixes tended not to exist because systems were usually designed such that they didn't really have easy to spot improvements in the first place. In the rare instances where that wasn't the case, there were enough people who were trying to do the right thing for the company (instead of being forced into obeying local incentives that are quite different from what's globally beneficial to the company) that someone else was probably going to fix the issue before I ever ran into it.

The algorithms/coding part of that company's interview (initial screen plus onsite combined) was easier than the phone screen at major tech companies and we basically didn't do a system design interview.

For a while, we tried an algorithmic onsite interview question that was on the hard side but in the normal range of what you might see in a BigCo phone screen (but still easier than you'd expect to see at an onsite interview). We stopped asking the question because every new grad we interviewed failed the question (we didn't give experienced candidates that kind of question). We simply weren't prestigious enough to get candidates who can easily answer those questions, so it was impossible to hire using the same trendy hiring filters that everybody else had.
In contemporary discussions on interviews, what we did is often called "lowering the bar", but it's unclear to me why we should care how high of a bar someone can jump over when little (and in some cases none) of the job they're being hired to do involves jumping over bars. And, in the cases where you do want them to jump over bars, they're maybe 2" high and can easily be walked over.

When measured on actual productivity, that was the most productive company I've worked for. I believe the reasons for that are cultural and too complex to fully explore in this post, but I think it helped that we didn't filter out perfectly good candidates with algorithms quizzes and assumed people could pick that stuff up on the job if we had a culture of people generally doing the right thing instead of focusing on local objectives.

If other companies want people to solve interview-level algorithms problems on the job, perhaps they could try incentivizing people to solve algorithms problems (when relevant). That could be done in addition to or even instead of filtering for people who can whiteboard algorithms problems.

Appendix: how did we get here?

Way back in the day, interviews often involved "trivia" questions. Modern versions of these might look like the following:

- What's MSI? MESI? MOESI? MESIF? What's the advantage of MESIF over MOESI?
- What happens when you throw in a destructor? What if it's C++11? What if a sub-object's destructor that's being called by a top-level destructor throws, which other sub-object destructors will execute? What if you throw during stack unwinding? Under what circumstances would that not cause std::terminate to get called?

I heard about this practice back when I was in school and even saw it with some "old school" companies. This was back when Microsoft was the biggest game in town and people who wanted to copy a successful company were likely to copy Microsoft.
The most widely read programming blogger at the time (Joel Spolsky) was telling people they need to adopt software practice X because Microsoft was doing it and they couldn't compete without adopting the same practices. For example, in one of the most influential programming blog posts of the era, Joel Spolsky advocates for what he called the Joel test in part by saying that you have to do these things to keep up with companies like Microsoft:

"A score of 12 is perfect, 11 is tolerable, but 10 or lower and you've got serious problems. The truth is that most software organizations are running with a score of 2 or 3, and they need serious help, because companies like Microsoft run at 12 full-time."

At the time, popular lore was that Microsoft asked people questions like the following (and I was actually asked one of these brainteasers during my interview with Microsoft around 2001, along with precisely zero algorithms or coding questions):

- how would you escape from a blender if you were half an inch tall?
- why are manhole covers round?
- a windowless room has 3 lights, each of which is controlled by a switch outside of the room. You are outside the room. You can only enter the room once. How can you determine which switch controls which lightbulb?

Since I was interviewing during the era when this change was happening, I got asked plenty of trivia questions as well as plenty of brainteasers (including all of the above brainteasers). Some other questions that aren't technically brainteasers that were popular at the time were Fermi problems. Another trend at the time was for behavioral interviews and a number of companies I interviewed with had 100% behavioral interviews with zero technical interviews.

Anyway, back then, people needed a rationalization for copying Microsoft-style interviews.
When I asked people why they thought brainteasers or Fermi questions were good, the convenient rationalization people told me was usually that they tell you if a candidate can really think, unlike those silly trivia questions, which only tell you if people have memorized some trivia. What we really need to hire are candidates who can really think!

Looking back, people now realize that this wasn't effective and that cargo culting Microsoft's every decision won't make you as successful as Microsoft because Microsoft's success came down to a few key things plus network effects, so copying how they interview can't possibly turn you into Microsoft. Instead, it's going to turn you into a company that interviews like Microsoft but isn't in a position to take advantage of the network effects that Microsoft was able to take advantage of.

For interviewees, the process with brainteasers was basically as it is now with algorithms questions, except that you'd review How Would You Move Mount Fuji before interviews instead of Cracking the Coding Interview to pick up a bunch of brainteaser knowledge that you'll never use on the job instead of algorithms knowledge you'll never use on the job.

Back then, interviewers would learn about questions specifically from interview prep books like "How Would You Move Mount Fuji?" and then ask them to candidates who learned the answers from books like "How Would You Move Mount Fuji?". When I talk to people who are ten years younger than me, they think this is ridiculous -- those questions obviously have nothing to do with the job and being able to answer them well is much more strongly correlated with having done some interview prep than being competent at the job.
Hillel Wayne has discussed how people come up with interview questions today (and I've also seen it firsthand at a few different companies) and, outside of groups that are testing for knowledge that's considered specialized, it doesn't seem all that different today.

At this point, we've gone through a few decades of programming interview fads, each one of which looks ridiculous in retrospect. Either we've finally found the real secret to interviewing effectively and have reasoned our way past whatever roadblocks were causing everybody in the past to use obviously bogus fad interview techniques, or we're in the middle of another fad, one which will seem equally ridiculous to people looking back a decade or two from now.

Without knowing anything about the effectiveness of interviews, at a meta level, since the way people get interview techniques is the same (crib the high-level technique from the most prestigious company around), I think it would be pretty surprising if this wasn't a fad. I would be less surprised to discover that current techniques were not a fad if people were doing or referring to empirical research or had independently discovered what works.

Inspired by a comment by Wesley Aptekar-Cassels, the last time I was looking for work, I asked some people how they checked the effectiveness of their interview process and how they tried to reduce bias in their process. The answers I got (grouped together when similar, in decreasing order of frequency) were:

- Huh? We don't do that and/or why would we do that?
- We don't really know if our process is effective
- I/we just know that it works
- I/we aren't biased
- I/we would notice bias if it existed, which it doesn't
- Someone looked into it and/or did a study, but no one who tells me this can ever tell me anything concrete about how it was looked into or what the study's methodology was

Appendix: training

As with most real world problems, when trying to figure out why seven, eight, or even nine figure per year interview-level algorithms bugs are lying around waiting to be fixed, there isn't a single "root cause" you can point to. Instead, there's a kind of hedgehog defense of misaligned incentives. Another part of this is that training is woefully underappreciated.

We've discussed that, at all but one company I've worked for, there are incentive systems in place that cause developers to feel like they shouldn't spend time looking at efficiency gains even when a simple calculation shows that there are tens or hundreds of millions of dollars in waste that could easily be fixed. And then because this isn't incentivized, developers tend to not have experience doing this kind of thing, making it unfamiliar, which makes it feel harder than it is. So even when a day of work could return $1m/yr in savings or profit (quite common at large companies, in my experience), people don't realize that it's only a day of work and could be done with only a small compromise to velocity. 
One way to solve this latter problem is with training, but that's even harder to get credit for than efficiency gains that aren't in your objectives!

Just for example, I once wrote a moderate length tutorial (4500 words, shorter than this post by word count, though probably longer if you add images) on how to find various inefficiencies (how to use an allocation or CPU time profiler, how to do service-specific GC tuning for the GCs we use, how to use some tooling I built that will automatically find inefficiencies in your JVM or container configs, etc., basically things that are simple and often high impact that it's easy to write a runbook for; if you're at Twitter, you can read this at http://go/easy-perf). I've had a couple people who would've previously come to me for help with an issue tell me that they were able to debug and fix an issue on their own and, secondhand, I heard that a couple other people who I don't know were able to go off and increase the efficiency of their service. I'd be surprised if I've heard about even 10% of cases where this tutorial helped someone, so I'd guess that this has helped tens of engineers, and possibly quite a few more.

If I'd spent a week doing "real" work instead of writing a tutorial, I'd have something concrete, with quantifiable value, that I could easily put into a promo packet or performance review. Instead, I have this nebulous thing that, at best, counts as a bit of "extra credit". I'm not complaining about this in particular -- this is exactly the outcome I expected. But, on average, companies get what they incentivize. 
If they expect training to come from developers (as opposed to hiring people to produce training materials, which tends to be very poorly funded compared to engineering) but don't value it as much as they value dev work, then there's going to be a shortage of training.

I believe you can also see training under-incentivized in public educational materials due to the relative difficulty of monetizing education and training. If you want to monetize explaining things, there are a few techniques that seem to work very well. If it's something that's directly obviously valuable, selling a video course that's priced "very high" (hundreds or thousands of dollars for a short course) seems to work. Doing corporate training, where companies fly you in to talk to a room of 30 people and you charge $3k per head also works pretty well.

If you want to reach (and potentially help) a lot of people, putting text on the internet and giving it away works pretty well, but monetization for that works poorly. For technical topics, I'm not sure the non-ad-blocking audience is really large enough to monetize via ads (as opposed to a pay wall).

Just for example, Julia Evans can support herself from her zine income, which she's said has brought in roughly $100k/yr for the past two years. Someone who does very well in corporate training can pull that in with a one or two day training course and, from what I've heard of corporate speaking rates, some highly paid tech speakers can pull that in with two engagements. 
Those are significantly above average rates, especially for speaking engagements, but since we're comparing to Julia Evans, I don't think it's unfair to use an above average rate.

Appendix: misaligned incentive hedgehog defense, part 3

Of the three examples above, I found one on a team where it was clearly worth zero to me to do anything that was actually valuable to the company and the other two on a team where it was valuable to me to do things that were good for the company, regardless of what they were. In my experience, that's very unusual for a team at a big company, but even on that team, incentive alignment was still quite poor. At one point, after getting a promotion and a raise, I computed the ratio of the amount of money my changes made the company vs. my raise and found that my raise was 0.03% of the money that I made the company, only counting easily quantifiable and totally indisputable impact to the bottom line. The vast majority of my work was related to tooling that had a difficult to quantify value that I suspect was actually larger than the value of the quantifiable impact, so I probably received well under 0.01% of the marginal value I was producing. And that's really an overestimate of how much I was incentivized to do the work -- at the margin, I strongly suspect that anything I did was worth zero to me. After the first $10m/yr or maybe $20m/yr, there's basically no difference in terms of performance reviews, promotions, raises, etc. 
Because there was no upside to doing work and there's some downside (could get into a political fight, could bring the site down, etc.), the marginal return to me of doing more than "enough" work was probably negative.

Some companies will give very large out-of-band bonuses to people regularly, but that work wasn't for a company that does a lot of that, so there's nothing the company could do to indicate that it valued additional work once someone did "enough" work to get the best possible rating on a performance review. From a mechanism design point of view, the company was basically asking employees to stop working once they did "enough" work for the year.

So even on this team, which was relatively well aligned with the company's success compared to most teams, the company's compensation system imposed a low ceiling on how well the team could be aligned.

This also happened in another way. As is common at a lot of companies, managers were given a team-wide budget for raises that was mainly a function of headcount, that was then doled out to team members in a zero-sum way. Unfortunately for each team member (at least in terms of compensation), the team pretty much only had productive engineers, meaning that no one was going to do particularly well in the zero-sum raise game. The team had very low turnover because people like working with good co-workers, but the company was applying one of the biggest levers it has, compensation, to try to get people to leave the team and join less effective teams.

Because this is such a common setup, I've heard of managers at multiple companies who try to retain people who are harmless but ineffective to try to work around this problem. If you were to ask someone, abstractly, if the company wants to hire and retain people who are ineffective, I suspect they'd tell you no. 
But insofar as a company can be said to want anything, it wants what it incentivizes.

Related

- Downsides of cargo-culting trendy hiring practices
- Normalization of deviance
- Zvi Mowshowitz on Moral Mazes, a book about how corporations have systemic issues that cause misaligned incentives at every level
- "randomsong" on how it's possible to teach almost anybody to program. Thematically related, the idea being that programming isn't as hard as a lot of programmers would like to believe
- Tanya Reilly on how "glue work" is poorly incentivized, training being poorly incentivized is arguably a special case of this
- Thomas Ptacek on using hiring filters that are decently correlated with job performance
- Michael Lynch on his personal experience of big company incentives
- An anonymous HN commenter on doing almost no work at Google, they say about 10% capacity, for six years and getting promoted

Thanks to Leah Hanson, Heath Borders, Lifan Zeng, Justin Findlay, Kevin Burke, @chordowl, Peter Alexander, Niels Olson, Kris Shamloo, Chip Thien, and Solomon Boulos for comments/corrections/discussion

For one thing, most companies that copy the Google interview don't have that much scale. But even for companies that do, most people don't have jobs where they're designing high-scale algorithms (maybe they did at Google circa 2003, but from what I've seen at three different big tech companies, most people's jobs are pretty light on algorithms work).

Real is in quotes because I've passed a number of interviews for reasons outside of the interview process. Maybe I had a very strong internal recommendation that could override my interview performance, maybe someone read my blog and assumed that I can do reasonable work based on my writing, maybe someone got a backchannel reference from a former co-worker of mine, or maybe someone read some of my open source code and judged me on that instead of a whiteboard coding question (and as far as I know, that last one has only happened once or twice). 
I'll usually ask why I got a job offer in cases where I pretty clearly failed the technical interview, so I have a collection of these reasons from folks.

The reason it's arguably zero is that the only software interview where I inarguably got a "real" interview and was coming in cold was at Google, but that only happened because the interviewers that were assigned interviewed me for the wrong ladder -- I was interviewing for a hardware position, but I was being interviewed by software folks, so I got what was basically a standard software interview except that one interviewer asked me some questions about state machine and cache coherence (or something like that). After they realized that they'd interviewed me for the wrong ladder, I had a follow-up phone interview from a hardware engineer to make sure I wasn't totally faking having worked at a hardware startup from 2005 to 2013. It's possible that I failed the software part of the interview and was basically hired on the strength of the follow-up phone screen.

Note that this refers only to software -- I'm actually pretty good at hardware interviews. At this point, I'm pretty out of practice at hardware and would probably need a fair amount of time to ramp up on an actual hardware job, but the interviews are a piece of cake for me. One person who knows me pretty well thinks this is because I "talk like a hardware engineer" and both say things that make hardware folks think I'm legit as well as say things that sound incredibly stupid to most programmers in a way that's more about shibboleths than actual knowledge or skills. 

This one is a bit harder than you'd expect to get in a phone screen, but it wouldn't be out of line in an onsite interview (although a friend of mine once got a Google Code Jam World Finals question in a phone interview with Google, so you might get something this hard or harder, depending on who you draw as an interviewer).

BTW, if you're wondering what my friend did when they got that question, it turns out they actually knew the answer because they'd seen and attempted the problem during Google Code Jam. They didn't get the right answer at the time, but they figured it out later just for fun. However, my friend didn't think it was reasonable to give that as a phone screen question and asked the interviewer for another question. The interviewer refused, so my friend failed the phone screen. At the time, I doubt there were more than a few hundred people in the world who would've gotten the right answer to the question in a phone screen and almost all of them probably would've realized that it was an absurd phone screen question. After failing the interview, my friend ended up looking for work for almost six months before passing an interview for a startup where he ended up building a number of core systems (in terms of both business impact and engineering difficulty). My friend is still there after the mid 10-figure IPO -- the company understands how hard it would be to replace this person and treats them very well. None of the other companies that interviewed this person even wanted to hire them at all and they actually had a hard time getting a job.

Outside of egregious architectural issues that will simply cause a service to fall over, the most common way I see teams fix efficiency issues is to ask for more capacity. 
Some companies try to counterbalance this in some way (e.g., I've heard that at FB, a lot of the teams that work on efficiency improvements report into the capacity org, which gives them the ability to block capacity requests if they observe that a team has extreme inefficiencies that they refuse to fix), but I haven't personally worked in an environment where there's an effective system fix to this. Google had a system that was intended to address this problem that, among other things, involved making headcount fungible with compute resources, but I've heard that was rolled back in favor of a more traditional system for reasons.
This is a pseudo-transcript for a talk given at Deconstruct 2019. To make this accessible for people on slow connections as well as people using screen readers, the slides have been replaced by in-line text (the talk has ~120 slides; at an average of 20 kB per slide, that's 2.4 MB. If you think that's trivial, consider that half of Americans still aren't on broadband and the situation is much worse in developing countries).

Let's talk about files! Most developers seem to think that files are easy. Just for example, let's take a look at the top reddit r/programming comments from when Dropbox announced that they were only going to support ext4 on Linux (the most widely used Linux filesystem). For people not familiar with reddit r/programming, I suspect r/programming is the most widely read English language programming forum in the world.

The top comment reads:

"I'm a bit confused, why do these applications have to support these file systems directly? Doesn't the kernel itself abstract away from having to know the lower level details of how the files themselves are stored?

The only differences I could possibly see between different file systems are file size limitations and permissions, but aren't most modern file systems about on par with each other?"

The #2 comment (and the top replies going two levels down) are:

#2: Why does an application care what the filesystem is?
#2: Shouldn't that be abstracted as far as "normal apps" are concerned by the OS?
Reply: It's a leaky abstraction. I'm willing to bet each different FS has its own bugs and its own FS specific fixes in the dropbox codebase. More FS's means more testing to make sure everything works right . . .
2nd level reply: What are you talking about? This is a dropbox, what the hell does it need from the FS? There are dozenz of fssync tools, data transfer tools, distributed storage software, and everything works fine with inotify. 
What the hell does not work for dropbox exactly?
another 2nd level reply: Sure, but any bugs resulting from should be fixed in the respective abstraction layer, not by re-implementing the whole stack yourself. You shouldn't re-implement unless you don't get the data you need from the abstraction. . . . DropBox implementing FS-specific workarounds and quirks is way overkill. That's like vim providing keyboard-specific workarounds to avoid faulty keypresses. All abstractions are leaky - but if no one those abstractions, nothing will ever get done (and we'd have billions of "operating systems").

In this talk, we're going to look at how file systems differ from each other and other issues we might encounter when writing to files. We're going to look at the file "stack" starting at the top with the file API, which we'll see is nearly impossible to use correctly and that supporting multiple filesystems without corrupting data is much harder than supporting a single filesystem; move down to the filesystem, which we'll see has serious bugs that cause data loss and data corruption; and then we'll look at disks and see that disks can easily corrupt data at a rate five million times greater than claimed in vendor datasheets.

File API

Writing one file

Let's say we want to write a file safely, so that we don't get data corruption. For the purposes of this talk, this means we'd like our write to be "atomic" -- our write should either fully complete, or we should be able to undo the write and end up back where we started. Let's look at an example from Pillai et al., OSDI 14.

We have a file that contains the text "a foo" and we want to overwrite "foo" with "bar" so we end up with "a bar". We're going to make a number of simplifications. For example, you should probably think of each character we're writing as a sector on disk (or, if you prefer, you can imagine we're using a hypothetical advanced NVM drive). 
Don't worry if you don't know what that means, I'm just pointing this out to note that this talk is going to contain many simplifications, which I'm not going to call out because we only have twenty-five minutes and the unsimplified version of this talk would probably take about three hours.

To write, we might use the pwrite syscall. This is a function provided by the operating system to let us interact with the filesystem. Our invocation of this syscall looks like:

    pwrite([file],
           "bar", // data to write
           3,     // write 3 bytes
           2)     // at offset 2

pwrite takes the file we're going to write, the data we want to write, "bar", the number of bytes we want to write, 3, and the offset where we're going to start writing, 2. If you're used to using a high-level language, like Python, you might be used to an interface that looks different, but underneath the hood, when you write to a file, it's eventually going to result in a syscall like this one, which is what will actually write the data into a file.

If we just call pwrite like this, we might succeed and get "a bar" in the output, or we might end up doing nothing and getting "a foo", or we might end up with something in between, like "a boo", "a bor", etc.

What's happening here is that we might crash or lose power when we write. Since pwrite isn't guaranteed to be atomic, if we crash, we can end up with some fraction of the write completing, causing data corruption. One way to avoid this problem is to store an "undo log" that will let us restore corrupted data. Before we modify the file, we'll make a copy of the data that's going to be modified (into the undo log), then we'll modify the file as normal, and if nothing goes wrong, we'll delete the undo log.

If we crash while we're writing the undo log, that's fine -- we'll see that the undo log isn't complete and we know that we won't have to restore because we won't have started modifying the file yet. If we crash while we're modifying the file, that's also ok. 
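To make the pwrite call and its "something in between" outcomes concrete, here's a minimal Python sketch. It assumes a Unix-like system (os.pwrite isn't available everywhere), and the helper names overwrite and torn_states are made up for illustration. It also uses the same simplification as the talk, treating each byte as if it could independently land or not land.

```python
import itertools
import os
import tempfile


def overwrite(path, data, offset):
    """Overwrite len(data) bytes of `path` starting at `offset` using pwrite."""
    fd = os.open(path, os.O_WRONLY)
    try:
        written = os.pwrite(fd, data, offset)  # pwrite may write fewer bytes than asked
        if written != len(data):
            raise OSError(f"short write: {written} of {len(data)} bytes")
    finally:
        os.close(fd)


def torn_states(old, new, offset):
    """Every content we could see if each new byte independently did or didn't land."""
    states = set()
    for mask in itertools.product([False, True], repeat=len(new)):
        buf = bytearray(old)
        for i, landed in enumerate(mask):
            if landed:
                buf[offset + i] = new[i]
        states.add(bytes(buf))
    return states


# The happy path: no crash, so the whole write lands.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "orig")
    with open(path, "wb") as f:
        f.write(b"a foo")
    overwrite(path, b"bar", 2)
    with open(path, "rb") as f:
        result = f.read()

# The unhappy paths: the states a crash could leave behind under the simplified model.
states = torn_states(b"a foo", b"bar", 2)
```

Only two of the enumerated states ("a foo" and "a bar") are ones we'd be happy to find after a crash; the rest are the corruption the undo log is meant to let us repair.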
When we try to restore from the crash, we'll see that the undo log is complete and we can use it to recover from data corruption:

    creat(/d/log) // Create undo log
    write(/d/log, "2,3,foo", 7) // To undo, at offset 2, write 3 bytes, "foo"
    pwrite(/d/orig, "bar", 3, 2) // Modify original file as before
    unlink(/d/log) // Delete log file

If we're using ext3 or ext4, widely used Linux filesystems, and we're using the mode data=journal (we'll talk about what these modes mean later), here are some possible outcomes we could get:

    d/log: "2,3,f"
    d/orig: "a foo"

    d/log: ""
    d/orig: "a foo"

It's possible we'll crash while the log file write is in progress and we'll have an incomplete log file. In the first case above, we know that the log file isn't complete because the file says we should start at offset 2 and write 3 bytes, but only one byte, f, is specified, so the log file must be incomplete. In the second case above, we can tell the log file is incomplete because the undo log format should start with an offset and a length, but we have neither. Either way, since we know that the log file isn't complete, we know that we don't need to restore.

Another possible outcome is something like:

    d/log: "2,3,foo"
    d/orig: "a boo"

    d/log: "2,3,foo"
    d/orig: "a bar"

In the first case, the log file is complete and we crashed while writing the file. This is fine, since the log file tells us how to restore to a known good state. In the second case, the write completed, but since the log file hasn't been deleted yet, we'll restore from the log file.

If we're using ext3 or ext4 with data=ordered, we might see something like:

    d/log: "2,3,fo"
    d/orig: "a boo"

    d/log: ""
    d/orig: "a bor"

With data=ordered, there's no guarantee that the write to the log file and the pwrite that modifies the original file will execute in program order. 
Instead, we could get:

    creat(/d/log) // Create undo log
    pwrite(/d/orig, "bar", 3, 2) // Modify file before writing undo log!
    write(/d/log, "2,3,foo", 7) // Write undo log
    unlink(/d/log) // Delete log file

To prevent this re-ordering, we can use another syscall, fsync. fsync is a barrier (prevents re-ordering) and it flushes caches (which we'll talk about later).

    creat(/d/log)
    write(/d/log, "2,3,foo", 7)
    fsync(/d/log) // Add fsync to prevent re-ordering
    pwrite(/d/orig, "bar", 3, 2)
    fsync(/d/orig) // Add fsync to prevent re-ordering
    unlink(/d/log)

This works with ext3 or ext4, data=ordered, but if we use data=writeback, we might see something like:

    d/log: "2,3,WAT"
    d/orig: "a boo"

Unfortunately, with data=writeback, the write to the log file isn't guaranteed to be atomic and the filesystem metadata that tracks the file length can get updated before we've finished writing the log file, which will make it look like the log file contains whatever bits happened to be on disk where the log file was created. Since the log file exists, when we try to restore after a crash, we may end up "restoring" random garbage into the original file. To prevent this, we can add a checksum (a way of making sure the file is actually valid) to the log file.

    creat(/d/log)
    write(/d/log, "2,3,[checksum],foo", 7) // Add checksum to log file to detect incomplete log file
    fsync(/d/log)
    pwrite(/d/orig, "bar", 3, 2)
    fsync(/d/orig)
    unlink(/d/log)

This should work with data=writeback, but we could still see the following:

    d/orig: "a boo"

There's no log file! Although we created a file, wrote to it, and then fsync'd it. Unfortunately, there's no guarantee that the directory will actually store the location of the file if we crash. 
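As an aside on the checksum step, here's one way a checksummed undo-log record might look in Python. The binary layout (offset, length, CRC32, then the saved bytes) and the function names are assumptions made for this sketch, not the format used in the paper or by any real system, and CRC32 is just a convenient stand-in for whatever checksum you'd actually pick.

```python
import struct
import zlib

_META = struct.Struct("<QI")  # hypothetical layout: 8-byte offset, 4-byte length
_CRC = struct.Struct("<I")    # CRC32 covering the metadata and the saved bytes


def encode_undo_record(offset, old):
    """Build a log record that lets recovery tell a complete log from garbage."""
    meta = _META.pack(offset, len(old))
    return meta + _CRC.pack(zlib.crc32(meta + old)) + old


def decode_undo_record(raw):
    """Return (offset, old_bytes) if the record is complete and valid, else None."""
    header_size = _META.size + _CRC.size
    if len(raw) < header_size:
        return None  # crashed before the header was fully written
    offset, length = _META.unpack_from(raw, 0)
    (crc,) = _CRC.unpack_from(raw, _META.size)
    old = raw[header_size:header_size + length]
    if len(old) != length or zlib.crc32(raw[:_META.size] + old) != crc:
        return None  # truncated payload, or bytes that were never ours ("WAT")
    return offset, old


record = encode_undo_record(2, b"foo")
garbage = record[:-3] + b"WAT"  # right length on disk, wrong contents
```

On recovery, a None from decode_undo_record corresponds to the "log isn't complete, so we don't need to restore" cases from earlier, and a decoded (offset, old_bytes) pair tells us what to write back into the original file.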
In order to make sure we can easily find the file when we restore from a crash, we need to fsync the parent of the newly created log.

    creat(/d/log)
    write(/d/log, "2,3,[checksum],foo", 7)
    fsync(/d/log)
    fsync(/d) // fsync parent directory
    pwrite(/d/orig, "bar", 3, 2)
    fsync(/d/orig)
    unlink(/d/log)

There are a couple more things we should do. We should also fsync after we're done (not shown), and we also need to check for errors. These syscalls can return errors and those errors need to be handled appropriately. There's at least one filesystem issue that makes this very difficult, but since that's not an API usage thing per se, we'll look at this again in the Filesystems section.

We've now seen what we have to do to write a file safely. It might be more complicated than we like, but it seems doable -- if someone asks you to write a file in a self-contained way, like an interview question, and you know the appropriate rules, you can probably do it correctly. But what happens if we have to do this as a day-to-day part of our job, where we'd like to write to files safely every time we write to files in a large codebase?

API in practice

Pillai et al., OSDI 14 looked at a bunch of software that writes to files, including things we'd hope write to files safely, like databases and version control systems: Leveldb, LMDB, GDBM, HSQLDB, SQLite, PostgreSQL, Git, Mercurial, HDFS, Zookeeper. They then wrote a static analysis tool that can find incorrect usage of the file API, things like incorrectly assuming that operations that aren't atomic are actually atomic, incorrectly assuming that operations that can be re-ordered will execute in program order, etc.

When they did this, they found that every single piece of software they tested except for SQLite in one particular mode had at least one bug. 
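For reference, the final syscall sequence from the single-file example can be sketched in Python roughly as follows. This is an illustration under assumptions, not a vetted implementation: the function names, the log file name, and the record format are made up, it assumes a Unix-like system where a directory can be opened and fsync'd, and it assumes the final "not shown" fsync is an fsync of the parent directory after the unlink. Error checking is left to Python, which raises OSError when one of these calls fails.

```python
import os
import tempfile
import zlib


def fsync_dir(dirpath):
    """fsync a directory so that entries created or removed in it are persisted."""
    fd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def safe_overwrite(path, new, offset):
    """Overwrite bytes in place, keeping an undo log until the write is on disk."""
    parent = os.path.dirname(os.path.abspath(path))
    log_path = os.path.join(parent, "log")  # hypothetical name, mirroring /d/log

    fd = os.open(path, os.O_RDWR)
    try:
        old = os.pread(fd, len(new), offset)
        body = b"%d,%d," % (offset, len(old)) + old
        record = zlib.crc32(body).to_bytes(4, "little") + body  # made-up format

        # creat, write, fsync(/d/log)
        log_fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        try:
            sent = 0
            while sent < len(record):
                sent += os.write(log_fd, record[sent:])
            os.fsync(log_fd)
        finally:
            os.close(log_fd)
        fsync_dir(parent)  # fsync(/d): persist the log's directory entry

        # pwrite, fsync(/d/orig): only touch the original once the log is safe
        done = 0
        while done < len(new):
            done += os.pwrite(fd, new[done:], offset + done)
        os.fsync(fd)
    finally:
        os.close(fd)

    os.unlink(log_path)  # unlink(/d/log)
    fsync_dir(parent)    # assumed target of the final fsync


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "orig")
    with open(target, "wb") as f:
        f.write(b"a foo")
    safe_overwrite(target, b"bar", 2)
    with open(target, "rb") as f:
        final_contents = f.read()
    log_remains = os.path.exists(os.path.join(d, "log"))
```

If any step raises, the log is deliberately left in place so that a recovery pass can find it. The recovery pass itself, and what to do when fsync is the call that fails, aren't covered by this sketch.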
This isn't a knock on the developers of this software or the software -- the programmers who work on things like Leveldb, LMDB, etc., know more about filesystems than the vast majority of programmers and the software has more rigorous tests than most software. But they still can't use files safely every time! A natural follow-up to this is the question: why is the file API so hard to use that even experts make mistakes?

Concurrent programming is hard

There are a number of reasons for this. If you ask people "what are hard problems in programming?", you'll get answers like distributed systems, concurrent programming, security, aligning things with CSS, dates, etc.

And if we look at what mistakes cause bugs when people do concurrent programming, we see bugs come from things like "incorrectly assuming operations are atomic" and "incorrectly assuming operations will execute in program order". These things that make concurrent programming hard also make writing files safely hard -- we saw examples of both of these kinds of bugs in our first example. More generally, many of the same things that make concurrent programming hard are the same things that make writing to files safely hard, so of course we should expect that writing to files is hard!

Another property writing to files safely shares with concurrent programming is that it's easy to write code that has infrequent, non-deterministic failures. With respect to files, people will sometimes say this makes things easier ("I've never noticed data corruption", "your data is still mostly there most of the time", etc.), but if you want to write files safely because you're working on software that shouldn't corrupt data, this makes things more difficult by making it more difficult to tell if your code is really correct.

API inconsistent

As we saw in our first example, even when using one filesystem, different modes may have significantly different behavior. 
Large parts of the file API look like this, where behavior varies across filesystems or across different modes of the same filesystem. For example, if we look at mainstream filesystems, appends are atomic, except when using ext3 or ext4 with data=writeback, or ext2 in any mode; and directory operations can't be re-ordered w.r.t. any other operations, except on btrfs. In theory, we should all read the POSIX spec carefully and make sure all our code is valid according to POSIX, but if they check filesystem behavior at all, people tend to code to what their filesystem does and not some abstract spec.

If we look at one particular mode of one filesystem (ext4 with data=journal), that seems relatively possible to handle safely, but when writing for a variety of filesystems, especially when handling filesystems that are very different from ext3 and ext4, like btrfs, it becomes very difficult for people to write correct code.

Docs unclear

In our first example, we saw that we can get different behavior from using different data= modes. If we look at the manpage (manual) on what these modes mean in ext3 or ext4, we get:

- journal: All data is committed into the journal prior to being written into the main filesystem.
- ordered: This is the default mode. All data is forced directly out to the main file system prior to its metadata being committed to the journal.
- writeback: Data ordering is not preserved -- data may be written into the main filesystem after its metadata has been committed to the journal. This is rumoured to be the highest-throughput option. It guarantees internal filesystem integrity, however it can allow old data to appear in files after a crash and journal recovery.

If you want to know how to use your filesystem safely, and you don't already know what a journaling filesystem is, this definitely isn't going to help you. If you know what a journaling filesystem is, this will give you some hints but it's still not sufficient. 
It's theoretically possible to figure everything out from reading the source code, but this is pretty impractical for most people who don't already know how the filesystem works.

For English-language documentation, there's lwn.net and the Linux kernel mailing list (LKML). LWN is great, but they can't keep up with everything, so LKML is the place to go if you want something comprehensive. Here's an example of an exchange on LKML about filesystems:

Dev 1: Personally, I care about metadata consistency, and ext3 documentation suggests that journal protects its integrity. Except that it does not on broken storage devices, and you still need to run fsck there.
Dev 2: as the ext3 authors have stated many times over the years, you still need to run fsck periodically anyway.
Dev 1: Where is that documented?
Dev 2: linux-kernel mailing list archives.
FS dev: Probably from some 6-8 years ago, in e-mail postings that I made.

While the filesystem developers tend to be helpful and they write up informative responses, most people probably don't keep up with the past 6-8 years of LKML.

Performance / correctness conflict

Another issue is that the file API has an inherent conflict between performance and correctness. We noted before that fsync is a barrier (which we can use to enforce ordering) and that it flushes caches. If you've ever worked on the design of a high-performance cache, like a microprocessor cache, you'll probably find the bundling of these two things into a single primitive to be unusual. A reason this is unusual is that flushing caches has a significant performance cost and there are many cases where we want to enforce ordering without paying this performance cost. 
Bundling these two things into a single primitive forces us to pay the cache flush cost when we only care about ordering.
Chidambaram et al., SOSP 13 looked at the performance cost of this by modifying ext4 to add a barrier mechanism that doesn't flush caches and they found that, if they modified software appropriately and used their barrier operation where a full fsync wasn't necessary, they were able to achieve performance roughly equivalent to ext4 with cache flushing entirely disabled (which is unsafe and can lead to data corruption) without sacrificing safety. However, making your own filesystem and getting it adopted is impractical for most people writing user-level software. Some databases will bypass the filesystem entirely or almost entirely, but this is also impractical for most software.
That's the file API. Now that we've seen that it's extraordinarily difficult to use, let's look at filesystems.
Filesystem
If we want to make sure that filesystems work, one of the most basic tests we could do is to inject errors at the layer below the filesystem to see if the filesystem handles them properly. For example, on a write, we could have the disk fail to write the data and return the appropriate error. If the filesystem drops this error or doesn't handle it properly, that means we have data loss or data corruption. This is analogous to the kinds of distributed systems faults Kyle Kingsbury talked about in his distributed systems testing talk yesterday (although these kinds of errors are much more straightforward to test).
Prabhakaran et al., SOSP 05 did this and found that, for most filesystems tested, almost all write errors were dropped. The major exception to this was on ReiserFS, which did a pretty good job with all types of errors tested, but ReiserFS isn't really used today for reasons beyond the scope of this talk.
We (Wesley Aptekar-Cassels and I) looked at this again in 2017 and found that things had improved significantly.
Most filesystems (other than JFS) could pass these very basic tests on error handling.
Another way to look for errors is to look at filesystem code to see if it handles internal errors correctly. Gunawi et al., FAST 08 did this and found that internal errors were dropped a significant percentage of the time. The technique they used made it difficult to tell if functions that could return many different errors were correctly handling each error, so they also looked at calls to functions that can only return a single error. In those cases, errors were dropped roughly 2/3 to 3/4 of the time, depending on the function.
Wesley and I also looked at this again in 2017 and found significant improvement -- errors for the same functions Gunawi et al. looked at were "only" ignored 1/3 to 2/3 of the time, depending on the function.
Gunawi et al. also looked at comments near these dropped errors and found comments like "Just ignore errors at this point. There is nothing we can do except to try to keep going." (XFS) and "Error, skip block and hope for the best." (ext3).
Now we've seen that while filesystems used to drop even the most basic errors, they now handle them correctly, but there are some code paths where errors can get dropped. For a concrete example of a case where this happens, let's look back at our first example. If we get an error on fsync, unless we have a pretty recent Linux kernel (Q2 2018-ish), there's a pretty good chance that the error will be dropped and it may even get reported to the wrong process!
On recent Linux kernels, there's a good chance the error will be reported (to the correct process, even). Wilcox, PGCon 18 notes that an error on fsync is basically unrecoverable. The details depend on the filesystem -- on XFS and btrfs, modified data that's in the filesystem will get thrown away and there's no way to recover.
On ext4, the data isn't thrown away, but it's marked as unmodified, so the filesystem won't try to write it back to disk later, and if there's memory pressure, the data can be thrown out at any time. If you're feeling adventurous, you can try to recover the data before it gets thrown out with various tricks (e.g., by forcing the filesystem to mark it as modified again, or by writing it out to another device, which will force the filesystem to write the data out even though it's marked as unmodified), but there's no guarantee you'll be able to recover the data before it's thrown out. On Linux ZFS, it appears that there's a code path designed to do the right thing, but CPU usage spikes and the system may hang or become unusable.
In general, there isn't a good way to recover from this on Linux. Postgres, MySQL, and MongoDB (widely used databases) will crash themselves and the user is expected to restore from the last checkpoint. Most software will probably just silently lose or corrupt data. And fsync is a relatively good case -- for example, syncfs simply doesn't return errors on Linux at all, leading to silent data loss and data corruption.
BTW, when Craig Ringer first proposed that Postgres should crash on fsync error, the first response on the Postgres dev mailing list was:
Surely you jest . . . If [current behavior of fsync] is actually the case, we need to push back on this kernel brain damage
But after talking through the details, everyone agreed that crashing was the only good option. One of the many unfortunate things is that most disk errors are transient. Since the filesystem discards critical information that's necessary to proceed without data corruption on any error, transient errors that could be retried instead force software to take drastic measures.
And while we've talked about Linux, this isn't unique to Linux. Fsync error handling (and error handling in general) is broken on many different operating systems.
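A minimal sketch of the "stop on fsync failure" policy described above, in Python. The function names are hypothetical and this isn't how any particular database implements it; the point is only that a failed fsync is treated as fatal rather than retried, since the filesystem may already have discarded the data.

```python
import os
import sys
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and fsync it, raising OSError on any failure."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than requested
            view = view[written:]
        os.fsync(fd)
    finally:
        os.close(fd)


def save_or_die(path: str, data: bytes) -> None:
    """Treat any write or fsync failure as fatal instead of retrying."""
    try:
        durable_write(path, data)
    except OSError as exc:
        # Retrying is not safe here: the kernel may already have dropped the
        # dirty data, so a later fsync could succeed without the data on disk.
        sys.exit(f"fatal: could not durably write {path}: {exc}")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        save_or_die(os.path.join(d, "state.bin"), b"checkpoint")
        print("write and fsync reported success")
```

After an exit like this, the application is expected to restart and recover from its last known-good state, which is the part that takes real work.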
At the time Postgres "discovered" the behavior of fsync on Linux, FreeBSD had arguably correct behavior, but OpenBSD and NetBSD behaved the same as Linux (true error status dropped, retrying causes success response, data lost). This has been fixed on OpenBSD and probably some other BSDs, but Linux still basically has the same behavior and you don't have good guarantees that this will work on any random UNIX-like OS.
Now that we've seen that, for many years, filesystems failed to handle errors in some of the most straightforward and simple cases and that there are cases that still aren't handled correctly today, let's look at disks.
Disk
Flushing
We've seen that it's easy to not realize we have to call fsync when we have to call fsync, and that even if we call fsync appropriately, bugs may prevent fsync from actually working. Rajimwale et al., DSN 11 looked into whether or not disks actually flush when you ask them to flush, assuming everything above the disk works correctly (their paper is actually mostly about something else, they just discuss this briefly at the beginning). Someone from Microsoft anonymously told them "[Some disks] do not allow the file system to force writes to disk properly" and someone from Seagate, a disk manufacturer, told them "[Some disks (though none from us)] do not allow the file system to force writes to disk properly". Bairavasundaram et al., FAST 07 also found the same thing when they looked into disk reliability.
Error rates
We've seen that filesystems sometimes don't handle disk errors correctly. If we want to know how serious this issue is, we should look at the rate at which disks emit errors. Disk datasheets will usually claim an uncorrectable bit error rate of 1e-14 for consumer HDDs (often called spinning metal or spinning rust disks), 1e-15 for enterprise HDDs, 1e-15 for consumer SSDs, and 1e-16 for enterprise SSDs.
This means that, on average, we expect to see one unrecoverable data error every 1e14 bits we read on a consumer HDD.
To get an intuition for what this means in practice, 1TB is now a pretty normal disk size. If we read a full drive once, that's 1e12 bytes, or almost 1e13 bits (technically 8e12 bits), which means we should see, in expectation, one unrecoverable error if we buy a 1TB HDD and read the entire disk ten-ish times. Nowadays, we can buy 10TB HDDs, in which case we'd expect to see an error (technically, 8/10ths of an error) on every read of an entire consumer HDD.
In practice, observed error rates are significantly higher. Narayanan et al., SYSTOR 16 (Microsoft) observed SSD error rates from 1e-11 to 6e-14, depending on the drive model. Meza et al., SIGMETRICS 15 (FB) observed even worse SSD error rates, 2e-9 to 6e-11 depending on the model of drive. An error rate of 2e-9 is one error every 500 megabits, or about 62.5 MB, which is 2 million to 20 million times worse than stated on SSD datasheets, depending on the class of drive.
Bit error rate is arguably a bad metric for disk drives, but this is the metric disk vendors claim, so that's what we have to compare against if we want an apples-to-apples comparison. See Bairavasundaram et al., SIGMETRICS'07, Schroeder et al., FAST'16, and others for other kinds of error rates.
One thing to note is that it's often claimed that SSDs don't have problems with corruption because they use error correcting codes (ECC), which can fix data corruption issues. "Flash banishes the specter of the unrecoverable data error", etc. The thing this misses is that modern high-density flash devices are very unreliable and need ECC to be usable at all. Grupp et al., FAST 12 looked at error rates of the kind of flash that underlies SSDs and found error rates from 1e-1 to 1e-8.
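As a rough check of the datasheet arithmetic for full-drive reads earlier in this section, here's a small Python sketch. The function names are made up for illustration, and it takes the same simplifying view as the text: the expected number of errors is just the number of bits read times the bit error rate.

```python
def expected_errors(bytes_read: float, bit_error_rate: float) -> float:
    """Expected number of unrecoverable errors when reading bytes_read bytes."""
    return bytes_read * 8 * bit_error_rate


def full_reads_per_error(drive_bytes: float, bit_error_rate: float) -> float:
    """How many full-drive reads we expect to do per unrecoverable error."""
    return 1 / expected_errors(drive_bytes, bit_error_rate)


TB = 1e12             # decimal terabyte, as used on drive labels
CONSUMER_HDD = 1e-14  # datasheet rate quoted in the text

for size_tb in (1, 10):
    errors = expected_errors(size_tb * TB, CONSUMER_HDD)
    reads = full_reads_per_error(size_tb * TB, CONSUMER_HDD)
    print(f"{size_tb} TB: {errors:.2f} expected errors per full read, "
          f"one error every {reads:.2f} full reads")
```

For a 1TB consumer HDD this works out to about 0.08 expected errors per full read (one every 12.5 reads), and for a 10TB drive about 0.8 per full read.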
1e-1 is one error every ten bits; 1e-8 is one error every 100 megabits.
Power loss
Another claim you'll hear is that SSDs are safe against power loss and some types of crashes because they now have "power loss protection" -- there's some mechanism in the SSDs that can hold power for long enough during an outage that the internal SSD cache can be written out safely.
Luke Leighton tested this by buying 6 SSDs that claim to have power loss protection and found that four out of the six models of drive he tested failed (every drive that wasn't an Intel drive). If we look at the details of the tests, when drives fail, it appears to be because they were used in a way that the implementor of power loss protection didn't expect (writing "too fast", although well under the rate at which the drive is capable of writing, or writing "too many" files in parallel). When a drive advertises that it has power loss protection, this appears to mean that someone spent some amount of effort implementing something that will, under some circumstances, prevent data loss or data corruption under power loss. But, as we saw in Kyle's talk yesterday on distributed systems, if you want to make sure that the mechanism actually works, you can't rely on the vendor to do rigorous or perhaps even any semi-serious testing and you have to test it yourself.
Retention
If we look at SSD datasheets, a young-ish drive (one with 90% of its write cycles remaining) will usually be specced to hold data for about ten years after a write. If we look at a worn out drive, one very close to end-of-life, it's specced to retain data for one year to three months, depending on the class of drive. I think people are often surprised to find that it's within spec for a drive to lose data three months after the data is written.
These numbers all come from datasheets and specs and, as we've seen, datasheets can be a bit optimistic.
On many early SSDs, using up most or all of a drive's write cycles would cause the drive to brick itself, so you wouldn't even get the spec'd three month data retention.
Corollaries
Now that we've seen that there are significant problems at every level of the file stack, let's look at a couple things that follow from this.
What to do?
What we should do about this is a big topic but, in the time we have left, one thing we can do instead of writing to files is to use databases. If you want something lightweight and simple that you can use in most places you'd use a file, SQLite is pretty good. I'm not saying you should never use files. There is a tradeoff here. But if you have an application where you'd like to reduce the rate of data corruption, consider using a database to store data instead of using files.
FS support
At the start of this talk, we looked at this Dropbox example, where most people thought that there was no reason to remove support for most Linux filesystems because filesystems are all the same. I believe their hand was forced by the way they want to store/use data, which they can only do with ext given how they're doing things (which is arguably a mis-feature), but even if that wasn't the case, perhaps you can see why software that's attempting to sync data to disk reliably and with decent performance might not want to support every single filesystem in the universe for an OS that, for their product, is relatively niche. Maybe it's worth supporting every filesystem for PR reasons and then going through the contortions necessary to avoid data corruption on a per-filesystem basis (you can try coding straight to your reading of the POSIX spec, but as we've seen, that won't save you on Linux), but the PR problem is caused by a misunderstanding.
The other comment we looked at on reddit, and also a common sentiment, is that it's not a program's job to work around bugs in libraries or the OS.
But user data gets corrupted regardless of whose "fault" the bug is, and as we've seen, bugs can persist in the filesystem layer for many years. In the case of Linux, most filesystems other than ZFS seem to have decided it's correct behavior to throw away data on fsync error and also not report that the data can't be written (as opposed to FreeBSD or OpenBSD, where most filesystems will at least report an error on subsequent fsyncs if the error isn't resolved). This is arguably a bug and also arguably correct behavior, but either way, if your software doesn't take this into account, you're going to lose or corrupt data. If you want to take the stance that it's not your fault that the filesystem is corrupting data, your users are going to pay the cost for that.
FAQ
While putting this talk together, I read a bunch of different online discussions about how to write to files safely. For discussions outside of specialized communities (e.g., LKML, the Postgres mailing list, etc.), many people will drop by to say something like "why is everyone making this so complicated? You can do this very easily and completely safely with this one weird trick". Let's look at the most common "one weird trick"s from two thousand internet comments on how to write to disk safely.
Rename
The most frequently mentioned trick is to rename instead of overwriting. If you remember our single-file write example, we made a copy of the data that we wanted to overwrite before modifying the file. The trick here is to do the opposite:
Make a copy of the entire file
Modify the copy
Rename the copy on top of the original file
This trick doesn't work. People seem to think that this is safe because the POSIX spec says that rename is atomic, but that only means rename is atomic with respect to normal operation; it doesn't mean it's atomic on crash. This isn't just a theoretical problem; if we look at mainstream Linux filesystems, most have at least one mode where rename isn't atomic on crash.
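For reference, here's roughly what the rename trick looks like when written out, as a Python sketch. The helper name and the modify callback are hypothetical, and the fsync calls are an assumption added here, not part of the three steps above. Nothing in this code makes the rename atomic on crash; it's only the pattern people usually have in mind.

```python
import os
import tempfile
from typing import Callable


def rewrite_via_rename(path: str, modify: Callable[[bytes], bytes]) -> None:
    """Copy the file, modify the copy, then rename the copy over the original."""
    dir_path = os.path.dirname(os.path.abspath(path))

    # Steps 1 and 2: build a modified copy in the same directory, so the
    # rename below stays within one filesystem.
    with open(path, "rb") as src:
        new_contents = modify(src.read())
    fd, tmp_path = tempfile.mkstemp(dir=dir_path)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_contents)
            tmp.flush()
            os.fsync(tmp.fileno())  # assumption: flush the copy before renaming
        # Step 3: rename the copy on top of the original file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # Assumption: also fsync the directory so the new name is persisted (POSIX only).
    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "settings.txt")
        with open(target, "wb") as f:
            f.write(b"a=1\n")
        rewrite_via_rename(target, lambda old: old + b"b=2\n")
        with open(target, "rb") as f:
            print(f.read())
```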
Rename also isn't guaranteed to execute in program order, as people sometimes expect.
The most mainstream exception where rename is atomic on crash is probably btrfs, but even there, it's a bit subtle -- as noted in Bornholt et al., ASPLOS 16, rename is only atomic on crash when renaming to replace an existing file, not when renaming to create a new file. Also, Mohan et al., OSDI 18 found numerous rename atomicity bugs on btrfs, some quite old and some introduced the same year as the paper, so you may not want to rely on this without extensive testing, even if you're writing btrfs specific code.
And even if this worked, the performance of this technique is quite poor.
Append
The second most frequently mentioned trick is to only ever append (instead of sometimes overwriting). This also doesn't work. As noted in Pillai et al., OSDI 14 and Bornholt et al., ASPLOS 16, appends don't guarantee ordering or atomicity and believing that appends are safe is the cause of some bugs.
One weird tricks
We've seen that the most commonly cited simple tricks don't work. Something I find interesting is that, in these discussions, people will drop into a discussion where it's already been explained, often in great detail, why writing to files is harder than someone might naively think, ignore all warnings and explanations and still proceed with their explanation for why it's, in fact, really easy. Even when warned that files are harder than people think, people still think they're easy!
Conclusion
In conclusion, computers don't work (but you probably already know this if you're here at Gary-conf). This talk happened to be about files, but there are many areas we could've looked into where we would've seen similar things.
One thing I'd like to note before we finish is that, IMO, the underlying problem isn't technical. If you look at what huge tech companies do (companies like FB, Amazon, MS, Google, etc.), they often handle writes to disk pretty safely.
They'll make sure that they have disks where power loss protection actually works, they'll have patches into the OS and/or other instrumentation to make sure that errors get reported correctly, there will be large distributed storage groups to make sure data is replicated safely, etc. We know how to make this stuff pretty reliable. It's hard, and it takes a lot of time and effort, i.e., a lot of money, but it can be done.
If you ask someone who works on that kind of thing why they spend mind boggling sums of money to ensure (or really, increase the probability of) correctness, you'll often get an answer like "we have a zillion machines and if you do the math on the rate of data corruption, if we didn't do all of this, we'd have data corruption every minute of every day. It would be totally untenable". A huge tech company might have, what, order of ten million machines? The funny thing is, if you do the math for how many consumer machines there are out there and how much consumer software runs on unreliable disks, the math is similar. There are many more consumer machines; they're typically operated at much lighter load, but there are enough of them that, if you own a widely used piece of desktop/laptop/workstation software, the math on data corruption is pretty similar. Without "extreme" protections, we should expect to see data corruption all the time.
But if we look at how consumer software works, it's usually quite unsafe with respect to handling data. IMO, the key difference here is that when a huge tech company loses data, whether that's data on who's likely to click on which ads or user emails, the company pays the cost, directly or indirectly, and the cost is large enough that it's obviously correct to spend a lot of effort to avoid data loss. But when consumers have data corruption on their own machines, they're mostly not sophisticated enough to know who's at fault, so the company can avoid taking the brunt of the blame.
If we have a global optimization function, the math is the same -- of course we should put more effort into protecting data on consumer machines. But if we're a company that's locally optimizing for our own benefit, the math works out differently and maybe it's not worth it to spend a lot of effort on avoiding data corruption.
Yesterday, Ramsey Nasser gave a talk where he made a very compelling case that something was a serious problem, which was followed up by a comment that his proposed solution will have a hard time getting adoption. I agree with both parts -- he discussed an important problem, and it's not clear how solving that problem will make anyone a lot of money, so the problem is likely to go unsolved.
With GDPR, we've seen that regulation can force tech companies to protect people's privacy in a way they're not naturally inclined to do, but regulation is a very big hammer and the unintended consequences can often negate or more than negate the benefits of regulation. When we look at the history of regulations that are designed to force companies to do the right thing, we can see that it's often many years, sometimes decades, before the full impact of the regulation is understood. Designing good regulations is hard, much harder than any of the technical problems we've discussed today.
Acknowledgements
Thanks to Leah Hanson, Gary Bernhardt, Kamal Marhubi, Rebecca Isaacs, Jesse Luehrs, Tom Crayford, Wesley Aptekar-Cassels, Rose Ames, and Benjamin Gilbert for their help with this talk!
Sorry we went so fast. If there's anything you missed you can catch it in the pseudo-transcript at danluu.com/deconstruct-files.
This "transcript" is pretty rough since I wrote it up very quickly this morning before the talk.
I'll try to clean it up within a few weeks, which will include adding material that was missed, inserting links, fixing typos, adding references that were missed, etc.
Thanks to Anatole Shaw, Jernej Simoncic, @junh1024, Yuri Vishnevsky, and Josh Duff for comments/corrections/discussion on this transcript.
A recurring discussion in Overwatch (as well as other online games) is whether or not women are treated differently from men. If you do a quick search, you can find hundreds of discussions about this, some of which have well over a thousand comments. These discussions tend to go the same way and involve the same debate every time, with the same points being made on both sides. Just for example, there are these three threads on reddit that spun out of a single post and have a total of 10.4k comments. On one side, you have people saying "sure, women get trash talked, but I'm a dude and I get trash talked, everyone gets trash talked, there's no difference", "I've never seen this, it can't be real", etc., and on the other side you have people saying things like "when I play with my boyfriend, I get accused of being carried by him all the time but the reverse never happens", "people regularly tell me I should play mercy[, a character that's a female healer]", and so on and so forth. In less time than has been spent on a single large discussion, we could just run the experiment, so here it is.
This is the result of playing 339 games in the two main game modes, quick play (QP) and competitive (comp), where roughly half the games were played with a masculine name (where the username was a generic term for a man) and half were played with a feminine name (where the username was a woman's name). I recorded all of the comments made in each of the games and then classified the comments by type. Classes of comments were "sexual/gendered comments", "being told how to play", "insults", and "compliments".
In each game that's included, I decided to include the game (or not) in the experiment before the character selection screen loaded.
In games that were included, I used the same character selection algorithm, I wouldn't mute anyone for spamming chat or being a jerk, I didn't speak on voice chat (although I had it enabled), I never sent friend requests, and I was playing outside of a group in order to get matched with 5 random players. When playing normally, I might choose a character I don't know how to use well and I'll mute people who pollute chat with bad comments. There are a lot of games that weren't included in the experiment because I wasn't in a mood to listen to someone rage at their team for fifteen minutes and the procedure I used involved pre-committing to not muting people who do that.
Sexual or sexually charged comments
I thought I'd see more sexual comments when using the feminine name as opposed to the masculine name, but that turned out to not be the case. There was some mention of sex, genitals, etc., in both cases and the rate wasn't obviously different and was actually higher in the masculine condition.
Zero games featured comments that were directed specifically at me in the masculine condition and two (out of 184) games in the feminine condition featured comments that were directed at me. Most comments were either directed at other players or just general comments to team or game chat.
Examples of typical undirected comments that would occur in either condition include "my girlfriend keeps sexting me how do I get her to stop?", "going in balls deep", "what a surprise. *strokes dick* [during the post-game highlight]", and "support your local boobies".
The two games that featured sexual comments directed at me had the following comments:
"please mam can i have some coochie", "yes mam please" [from two different people], ":boicootie:"
"my dicc hard" [believed to be directed at me from context]
During games not included in the experiment (I generally didn't pay attention to which username I was on when not in the experiment), I also got comments like "send nudes".
Anecdotally, there appears to be a difference in the rate of these kinds of comments directed at the player, but the rate observed in the experiment is so low that uncertainty intervals around any estimates of the true rate will be similar in both conditions unless we use a strong prior.
The fact that this difference couldn't be observed in 339 games was surprising to me, although it's not inconsistent with McDaniel's thesis, a survey of women who play video games. 339 games probably sounds like a small number to serious gamers, but the only other randomized experiment I know of on this topic (besides this experiment) is Kasumovic et al., which notes that "[w]e stopped at 163 [games] as this is a substantial time effort".
All of the analysis uses the number of games in which a type of comment occurred, and not tone, so that comments don't have to be coded as having a certain tone, which could inject bias into the process. Sentiment analysis models, even state-of-the-art ones, often return nonsensical results, so this basically has to be done by hand, at least today. With much more data, some kind of sentiment analysis, done with liberal spot checking and re-training of the model, could work, but the total number of comments is so small in this case that it would amount to coding each comment by hand.
Coding comments manually in an unbiased fashion can also be done with a level of blinding, but doing that would probably require getting more people involved (since I see and hear comments while I'm playing) and relying on unpaid or poorly paid labor.
Being told how to play
The most striking, easy to quantify, difference was the rate at which I played games in which people told me how I should play. Since it's unclear how much confidence we should have in the difference if we just look at the raw rates, we'll use a simple statistical model to get the uncertainty interval around the estimates.
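The exact model isn't given here, so as an assumption, the sketch below uses one common simple choice: a binomial likelihood with a uniform Beta(1, 1) prior, which gives a Beta posterior for the per-game rate. The counts are hypothetical, not the experiment's data, and the quartiles are estimated by sampling with Python's standard library rather than computed exactly.

```python
import random
import statistics


def rate_quartiles(hits: int, games: int, draws: int = 100_000, seed: int = 0):
    """Approximate 25th, 50th, and 75th percentiles of the posterior rate.

    Assumes a uniform Beta(1, 1) prior, so after observing `hits` games with
    the event out of `games`, the posterior is Beta(hits + 1, games - hits + 1).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    alpha = hits + 1
    beta = games - hits + 1
    samples = [rng.betavariate(alpha, beta) for _ in range(draws)]
    p25, p50, p75 = statistics.quantiles(samples, n=4)
    return p25, p50, p75


if __name__ == "__main__":
    # Hypothetical counts: the event was seen in 10 of 50 games.
    p25, p50, p75 = rate_quartiles(hits=10, games=50)
    print(f"P25={p25:.1%}  median={p50:.1%}  P75={p75:.1%}")
```

With few games per condition, the interval between the 25th and 75th percentiles is wide, which is why comparing raw rates alone can be misleading.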
Since I'm not sure what my belief about this should be, this uses an uninformative prior, so the estimate is close to the actual rate. Anyway, here are the uncertainty intervals a simple model puts on the percent of games where at least one person told me I was playing wrong, that I should change how I'm playing, or that I should switch characters:
Cond     Est  P25  P75
F comp    19   13   25
M comp     6    2   10
F QP       4    3    6
M QP       1    0    2
The experimental conditions in this table are masculine vs. feminine name (M/F) and competitive mode vs. quick play (comp/QP). The numbers are percents. Est is the estimate, P25 is the 25%-ile estimate, and P75 is the 75%-ile estimate. Competitive mode and using a feminine name are both correlated with being told how to play. See this post by Andrew Gelman for why you might want to look at the 50% interval instead of the 95% interval.
For people not familiar with Overwatch, in competitive mode, you're explicitly told what your ELO-like rating is and you get a badge that reflects your rating. In quick play, you have a rating that's tracked, but it's never directly surfaced to the user and you don't get a badge.
It's generally believed that people are more on edge during competitive play and are more likely to lash out (and, for example, tell you how you should play). The data is consistent with this common belief.
Per above, I didn't want to code tone of messages to avoid bias, so this table only indicates the rate at which people told me I was playing incorrectly or asked that I switch to a different character. The qualitative difference in experience is understated by this table.
For example, the one time someone asked me to switch characters in the masculine condition, the request was a one sentence, polite, request ("hey, we're dying too quickly, could we switch [from the standard one primary healer / one off healer setup] to double primary healer or switch our tank to [a tank that can block more damage]?"). When using the feminine name, a typical case would involve 1-4 people calling me human garbage for most of the game and consoling themselves with the idea that the entire reason our team is losing is that I won't change characters.
The simple model we're using indicates that there's probably a difference both between competitive and QP and between playing with a masculine vs. a feminine name. However, most published results are pretty bogus, so let's look at reasons this result might be bogus and then you can decide for yourself.
Threats to validity
The biggest issue is that this wasn't a pre-registered trial. I'm obviously not going to go and officially register a trial like this, but I also didn't informally "register" this by having this comparison in mind when I started the experiment. A problem with non-pre-registered trials is that there are a lot of degrees of freedom, both in terms of what we could look at, and in terms of the methodology we used to look at things, so it's unclear if the result is "real" or an artifact of fishing for something that looks interesting. A standard example of this is that, if you look for 100 possible effects, you're likely to find 1 that appears to be statistically significant with p = 0.01.
There are standard techniques to correct for this problem (e.g., Bonferroni correction), but I don't find these convincing because they usually don't capture all of the degrees of freedom that go into a statistical model. An example is that it's common to take a variable and discretize it into a few buckets.
There are many ways to do this and you generally won't see papers talk about the impact of this or correct for this in any way, although changing how these buckets are arranged can drastically change the results of a study. Another common knob people can use to manipulate results is curve fitting to an inappropriate curve (often a 2nd or 3rd degree polynomial when a scatterplot shows that's clearly incorrect). Another way to handle this would be to use a more complex model, but I wanted to keep this as simple as possible.
If I wanted to really be convinced on this, I'd want to, at a minimum, re-run this experiment with this exact comparison in mind. As a result, this experiment would need to be replicated to provide more than a preliminary result that is, at best, weak evidence.
One other large class of problem with randomized controlled trials (RCTs) is that, despite randomization, the two arms of the experiment might be different in some way that wasn't randomized. Since Overwatch doesn't allow you to keep changing your name, this experiment was done with two different accounts and these accounts had different ratings in competitive mode. On average, the masculine account had a higher rating due to starting with a higher rating, which meant that I was playing against stronger players and having worse games on the masculine account. In the long run, this will even out, but since most games in this experiment were in QP, this didn't have time to even out in comp.
As a result, I had a higher win rate as well as just generally much better games with the feminine account in comp.
With no other information, we might expect that people who are playing worse get told how to play more frequently and people who are playing better should get told how to play less frequently, which would mean that the table above understates the actual difference.
However, Kasumovic et al., in a gender-based randomized trial of Halo 3, found that players who were playing poorly were more negative towards women, especially women who were playing well (there's enough statistical manipulation of the data that a statement this concise can only be roughly correct, see study for details). If that result holds, it's possible that I would've gotten fewer people telling me that I'm human garbage and need to switch characters if I was average instead of dominating most of my games in the feminine condition.
If that result generalizes to OW, that would explain something which I thought was odd, which was that a lot of demands to switch and general vitriol came during my best performances with the feminine account. A typical example of this would be a game where we have a 2-2-2 team composition (2 players playing each of the three roles in the game) where my counterpart in the same role ran into the enemy team and died at the beginning of the fight in almost every engagement. I happened to be having a good day and dominated the other team (37-2 in a ten minute comp game, while focusing on protecting our team's healers) while only dying twice, once on purpose as a sacrifice and a second time after a stupid blunder.
Immediately after I died, someone asked me to switch roles so they could take over for me, but at no point did someone ask the other player in my role to switch despite their total uselessness all game (for OW players this was a Rein who immediately charged into the middle of the enemy team at every opportunity, from a range where our team could not possibly support them; this was Hanamura 2CP, where it's very easy for Rein to set up situations where their team cannot help them). This kind of performance was typical of games where my team jumped on me for playing incorrectly. This isn't to say I didn't have bad games; I had plenty of bad games, but a disproportionate number of the most toxic experiences came when I was having a great game.

I tracked how well I did in games, but this sample doesn't have enough ranty games to do a meaningful statistical analysis of my performance vs. probability of getting thrown under the bus.

Games at different ratings are probably also generally different environments and get different comments, but it's not clear if there are more negative comments at 2000 than 2500 or vice versa.
There are a lot of online debates about this; for any rating level other than the very lowest or the very highest ratings, you can find a lot of people who say that the rating band they're in has the highest volume of toxic comments.

Other differences

Here are some things that happened while playing with the feminine name that didn't happen with the masculine name during this experiment or in any game outside of this experiment:

- unsolicited "friend" requests from people I had no textual or verbal interaction with (happened 7 times total, didn't track which cases were in the experiment and which weren't)
- someone on the other team deciding that my team wasn't doing a good enough job of protecting me while I was playing healer, berating my team, and then throwing the game so that we won (happened once during the experiment)
- someone on my team flirting with me and then flipping out when I don't respond, who then spends the rest of the game calling me autistic or toxic (this happened once during the experiment, and once while playing in a game not included in the experiment)

The rate of all these was low enough that I'd have to play many more games to observe something without a huge uncertainty interval.

I didn't accept any friend requests from people I had no interaction with. Anecdotally, some people report people will send sexual comments or berate them after an unsolicited friend request. It's possible that the effect shown in the table would be larger if I accepted these friend requests and it couldn't be smaller.

I didn't attempt to classify comments as flirty or not because, unlike the kinds of comments I did classify, this is often somewhat subtle and you could make a good case that any particular comment is or isn't flirting. Without responding (which I didn't do), many of these kinds of comments are ambiguous.

Another difference was in the tone of the compliments.
The rate of games where I was complimented wasn't too different, but compliments under the masculine condition tended to be short and factual (e.g., someone from the other team saying "no answer for [name of character I was playing]" after a dominant game) and compliments under the feminine condition tended to be more effusive and multiple people would sometimes chime in about how great I was.

Non-differences

The rate of compliments and the rate of insults in games that didn't include explanations of how I'm playing wrong or how I need to switch characters were similar in both conditions.

Other factors

Some other factors that would be interesting to look at would be time of day, server, playing solo or in a group, specific character choice, being more or less communicative, etc., but it would take a lot more data to be able to get good estimates when adding more variables. Blizzard should have the data necessary to do analyses like this in aggregate, but they're notoriously private with their data, so someone at Blizzard would have to do the work and then publish it publicly, and they're not really in the habit of doing that kind of thing. If you work at Blizzard and are interested in letting a third party do some analysis on an anonymized data set, let me know and I'd be happy to dig in.

Experimental minutiae

Under both conditions, I avoided ever using voice chat and would call things out in text chat when time permitted. Also under both conditions, I mostly filled in with whatever character class the team needed most, although I'd sometimes pick DPS (in general, DPS are heavily oversubscribed, so you'll rarely play DPS if you don't pick one even when unnecessary).

For quickplay, backfill games weren't counted (backfill games are games where you join after the game started to fill in for a player who left; comp doesn't allow backfills). 6% of QP games were backfills.

These games are from before the "endorsements" patch; most games were played around May 2018.
All games were played in "solo q" (with 5 random teammates). In order to avoid correlations between games depending on how long playing sessions were, I quit between games and waited for enough time (since you're otherwise likely to end up in a game with some or many of the same players as before).

The model used probability of a comment happening in a game to avoid the problem that Kasumovic et al. ran into, where a person who's ranting can skew the total number of comments. Kasumovic et al. addressed this by removing outliers, but I really don't like manually reaching in and removing data to adjust results. This could also be addressed by using a more sophisticated model, but a more sophisticated model means more knobs which means more ways for bias to sneak in. Using the number of players who made comments instead would be one way to mitigate this problem, but I think this still isn't ideal because these aren't independent -- when one player starts being negative, this greatly increases the odds that another player in that game will be negative, but just using the number of players makes four games with one negative person the same as one game with four negative people. This can also be accounted for with a slightly more sophisticated model, but that also involves adding more knobs to the model.

UPDATE: 98%-ile

One of the more common comments I got when I wrote this post is that it's only valid at "low" ratings, like Plat, which is 50%-ile. If someone is going to concede that a game's community is toxic at 50%-ile and you have to be significantly better than that to avoid toxic players, that seems to be conceding that the game's community is toxic.

However, to see if that's accurate, I played a bit more and played in games as high as 98%-ile to see if things improved.
While there was a minor improvement, it's not fundamentally different at 98%-ile, so people who are saying that things are much better at higher ranks either have very different experiences than I did or are referring to 99%-ile or above. If it's the latter, then I'd say that the previous comment about conceding that the game has a toxic community holds. If it's the former, perhaps I just got unlucky, but based on other people's comments about their experiences with the game, I don't think I got particularly unlucky.

Appendix: comments / advice to Overwatch players

A common complaint, perhaps the most common complaint by people below 2000 SR (roughly 30%-ile) or perhaps 1500 SR (roughly 10%-ile) is that they're in "ELO hell" and are kept down because their teammates are too bad. Based on my experience, I find this to be extremely unlikely.

People often split skill up into "mechanics" and "gamesense". My mechanics are pretty much as bad as it's possible to get. The last game I played seriously was a 90s video game that's basically online Asteroids and the last game before that I put any time into was the original SNES Super Mario Kart. As you'd expect from someone who hasn't put significant time into a post-90s video game or any kind of FPS game, my aim and dodging are both atrocious. On top of that, I'm an old dude with slow reflexes and I was able to get to 2500 SR (roughly 60%-ile among players who play "competitive", likely higher among all players) by avoiding a few basic fallacies and blunders despite having approximately zero mechanical skill. If you're also an old dude with basically no FPS experience, you can do the same thing; if you have good reflexes or enough FPS experience to actually aim or dodge, you basically can't be worse mechanically than I am and you can do much better by avoiding a few basic mistakes.

The most common fallacy I see repeated is that you have to play DPS to move out of bronze or gold.
The evidence people give for this is that, when a GM streamer plays flex, tank, or healer, they sometimes lose in bronze. I guess the idea is that, because the only way to ensure a 99.9% win rate in bronze is to be a GM level DPS player and play DPS, the best way to maintain a 55% or a 60% win rate is to play DPS, but this doesn't follow.

Healers and tanks are both very powerful in low ranks. Because low ranks feature both poor coordination and relatively poor aim (players with good coordination or aim tend to move up quickly), time-to-kill is very slow compared to higher ranks. As a result, an off healer can tilt the result of a 1v1 (and sometimes even a 2v1) matchup and a primary healer can often determine the result of a 2v1 matchup. Because coordination is poor, most matchups end up being 2v1 or 1v1. The flip side of the lack of coordination is that you'll almost never get help from teammates. It's common to see an enemy player walk into the middle of my team, attack someone, and then walk out while literally no one else notices. If the person being attacked is you, the other healer typically won't notice and will continue healing someone at full health and none of the classic "peel" characters will help or even notice what's happening. That means it's on you to pay attention to your surroundings and watch flank routes to avoid getting murdered.

If you can avoid getting murdered constantly and actually try to heal (as opposed to many healers at low ranks, who will try to kill people or stick to a single character and continue healing them all the time even if they're at full health), you'll outheal a primary healer half the time when playing an off healer and, as a primary healer, you'll usually be able to get 10k-12k healing per 10 min compared to 6k to 8k for most people in Silver (sometimes less if they're playing DPS Moira). That's like having an extra half a healer on your team, which basically makes the game 6.5v6 instead of 6v6.
You can still lose a 6.5v6 game, and you'll lose plenty of games, but if you're consistently healing 50% more than a normal healer at your rank, you'll tend to move up even if you get a lot of major things wrong (heal order, healing when that only feeds the other team, etc.).

A corollary to having to watch out for yourself 95% of the time when playing a healer is that, as a character who can peel, you can actually watch out for your teammates and put your team at a significant advantage in 95% of games. As Zarya or Hog, if you just boringly play towards the front of your team, you can basically always save at least one teammate from death in a team fight, and you can often do this 2 or 3 times. Meanwhile, your counterpart on the other team is walking around looking for 1v1 matchups. If they find a good one, they'll probably kill someone, and if they don't (if they run into someone with a mobility skill or a counter like Brig or Reaper), they won't. Even in the case where they kill someone and you don't do a lot, you still provide as much value as them and, on average, you'll provide more value. A similar thing is true of many DPS characters, although it depends on the character (e.g., McCree is effective as a peeler, at least at the low ranks that I've played in). If you play a non-sniper DPS that isn't suited for peeling, you can find a DPS on your team who's looking for 1v1 fights and turn those fights into 2v1 fights (at low ranks, there's no shortage of these folks on both teams, so there are plenty of 1v1 fights you can control by making them 2v1).

All of these things I've mentioned amount to actually trying to help your team instead of going for flashy PotG setups or trying to dominate the entire team by yourself. If you say this in the abstract, it seems obvious, but most people think they're better than their rating.
It doesn't help that OW is designed to make people think they're doing well when they're not and the best way to get "medals" or "play of the game" is to play in a way that severely reduces your odds of actually winning each game.

Outside of obvious gameplay mistakes, the other big thing that loses games is when someone tilts and either starts playing terribly or flips out and says something to enrage someone else on the team, who then starts playing terribly. I don't think you can actually do much about this directly, but you can make sure you never do this yourself, so 5/6 of your team will do this at some base rate, whereas 6/6 of the other team will do this. Like all of the above, this won't cause you to win all of your games, but everything you do that increases your win rate makes a difference.

Poker players have the right attitude when they talk about leaks. The goal isn't to win every hand, it's to increase your EV by avoiding bad blunders (at high levels, it's about more than avoiding bad blunders, but we're talking about getting out of below median ranks, not becoming GM here). You're going to have terrible games where you get 5 people instalocking DPS. Your odds of winning a game are low, say 10%. If you get mad and pick DPS and reduce your odds even further (say this is to 2%), all that does is create a leak in your win rate during games when your teammates are being silly.

If you gain/lose 25 rating per game for a win or a loss, your average rating change from a game is 25 * (W_rate - L_rate) = 25 * (2 * W_rate - 1). Let's say 1/40 games are these silly games where your team decides to go all DPS. The win rate gap between trying to win these and soft throwing is 0.10 - 0.02 = 0.08, so the per-game SR difference is maybe something like 1/40 * 25 * (2 * 0.08) = 0.1. That doesn't sound like much and these numbers are just guesses, but everyone outside of very high-level games is full of leaks like these, and they add up.
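The leak arithmetic above can be checked with a few lines of Python. This is only a sketch of the back-of-the-envelope calculation: the function names are made up for illustration, and the inputs (25 SR per game, 1 in 40 games, 10% vs. 2% win odds) are the guesses from the text, not measured values.

```python
def expected_sr_change(win_rate: float, sr_per_game: float = 25.0) -> float:
    """Average SR change per game, assuming every game is a win or a loss."""
    loss_rate = 1.0 - win_rate
    return sr_per_game * (win_rate - loss_rate)  # same as 25 * (2 * W_rate - 1)


def leak_cost(game_fraction: float, win_rate_trying: float,
              win_rate_throwing: float, sr_per_game: float = 25.0) -> float:
    """SR lost per game played, averaged over all games, from throwing a
    fraction of them instead of trying to win."""
    gap = (expected_sr_change(win_rate_trying, sr_per_game)
           - expected_sr_change(win_rate_throwing, sr_per_game))
    return game_fraction * gap


# Guessed numbers from the text: 1 in 40 games is an all-DPS game,
# with a 10% chance of winning if you try and 2% if you soft throw.
print(round(leak_cost(1 / 40, 0.10, 0.02), 3))
```

Passing a different `game_fraction` or win odds shows how quickly several small leaks add up to a noticeable share of the per-game SR gain.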
And if you look at a 60% win rate, which is pretty good considering that your influence is limited because you're only one person on a 6 person team, that only translates to an average of 5 SR per game, so it doesn't actually take that many small leaks to really move your average SR gain or loss.

Appendix: general comments on online gaming, 20 years ago vs. today

Since I'm unlikely to write another blog post on gaming any time soon, here are some other random thoughts that won't fit with any other post. My last serious experience with online games was with a game from the 90s. Even though I'd heard that things were a lot worse, I was still surprised by it. IRL, the only time I encounter the same level and rate of pointless nastiness in a recreational activity is down at the bridge club (casual bridge games tend to be very nice). When I say pointless nastiness, I mean things like getting angry and then making nasty comments to a teammate mid-game. Even if your "criticism" is correct (and, if you review OW games or bridge hands, you'll see that these kinds of angry comments are almost never correct), this has virtually no chance of getting your partner to change their behavior and it has a pretty good chance of tilting them and making them play worse. If you're trying to win, there's no reason to do this and good reason to avoid this.

If you look at the online commentary for this, it's common to see people blaming kids, but this doesn't match my experience at all. For one thing, when I was playing video games in the 90s, a huge fraction of the online gaming population was made up of kids, and online game communities were nicer than they are today. Saying that "kids nowadays" are worse than kids used to be is a pastime that goes back thousands of years, but it's generally not true and there doesn't seem to be any reason to think that it's true here.

Additionally, this simply doesn't match what I saw.
If I just look at comments over audio chat, there were a couple of times when some kids were nasty, but almost all of the comments are from people who sound like adults. Moreover, if I look at when I played games that were bad, a disproportionately large number of those games were late (after 2am eastern time, on the central/east server), where the relative population of adults is larger.

And if we look at bridge, the median age of an ACBL member is in the 70s, with an increase in age of a whopping 0.4 years per year.

Sure, maybe people tend to get more mature as they age, but in any particular activity, that effect seems to be dominated by other factors. I don't have enough data at hand to make a good guess as to what happened, but I'm entertained by the idea that this might have something to do with it:

I've said this before, but one of the single biggest culture shocks I've ever received was when I was talking to someone about five years younger than I was, and she said "Wait, you play video games? I'm surprised. You seem like way too much of a nerd to play video games. Isn't that like a fratboy jock thing?"

Appendix: FAQ

Here are some responses to the most common online comments.

Plat? You suck at Overwatch

Yep. But I sucked roughly equally on both accounts (actually somewhat more on the masculine account because it was rated higher and I was playing a bit out of my depth). Also, that's not a question.

This is just a blog post, it's not an academic study, the results are crap.

There's nothing magic about academic papers. I have my name on a few publications, including one that won best paper award at the top conference in its field. My median blog post is more rigorous than my median paper or, for that matter, the median paper that I read.

When I write a paper, I have to deal with co-authors who push for putting in false or misleading material that makes the paper look good and my ability to push back against this has been fairly limited.
On my blog, I don't have to deal with that and I can write up results that are accurate (to the best of my ability) even if it makes the result look less interesting or less likely to win an award.

Gamers have always been toxic, that's just nostalgia talking.

If I pull game logs for subspace, this seems to be false. YMMV depending on what games you played, I suppose. FWIW, airmash seems to be the modern version of subspace, and (until the game died), it was much more toxic than subspace even if you just compare on a per-game basis despite having much smaller games (25 people for a good sized game in airmash, vs. 95 for subspace).

This is totally invalid because you didn't talk on voice chat.

At the ranks I played, not talking on voice was the norm. It would be nice to have talking or not talking on voice chat be an independent variable, but that would require playing even more games to get data for another set of conditions, and if I wasn't going to do that, choosing the condition that's most common doesn't make the entire experiment invalid, IMO.

Some people report that, post "endorsements" patch, talking on voice chat is much more common. I tested this out by playing 20 (non-comp) games just after the "Paris" patch. Three had comments on voice chat. One was someone playing random music clips, one had someone screaming at someone else for playing incorrectly, and one had useful callouts on voice chat.
It's possible I'd see something different with more games or in comp, but I don't think it's obvious that voice chat is common for most people after the "endorsements" patch.

Appendix: code and data

If you want to play with this data and model yourself, experiment with different priors, run a posterior predictive check, etc., here's a snippet of R code that embeds the data:

    library(brms)
    library(modelr)
    library(tidybayes)
    library(tidyverse)

    # One row per (game type, name condition): xplain of the games played
    # had the comments being modeled.
    d <- tribble(
      ~game_type, ~gender, ~xplain, ~games,
      "comp", "female", 7, 35,
      "comp", "male", 1, 23,
      "qp", "female", 6, 149,
      "qp", "male", 2, 132
    )

    # Encode the two conditions as 0/1 indicator columns.
    d <- d %>% mutate(female = ifelse(gender == "female", 1, 0),
                      comp = ifelse(game_type == "comp", 1, 0))

    # Binomial model of xplain out of games, with a normal(0,10) prior
    # on the coefficients.
    result <- brm(data = d, family = binomial,
                  xplain | trials(games) ~ female + comp,
                  prior = c(set_prior("normal(0,10)", class = "b")),
                  iter = 25000, warmup = 500, cores = 4, chains = 4)

The model here is simple enough that I wouldn't expect the version of software used to significantly affect results, but in case you're curious, this was done with brms 2.7.0, rstan 2.18.2, on R 3.5.1.

Thanks to Leah Hanson, Sean Talts and Sean's math/stats reading group, Annie Cherkaev, Robert Schuessler, Wesley Aptekar-Cassels, Julia Evans, Paul Gowder, Jonathan Dahan, Bradley Boccuzzi, Akiva Leffert, and one or more anonymous commenters for comments/corrections/discussion.
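For a quick look at the counts embedded in the R snippet above without installing brms, the raw per-condition rates can be computed directly. This Python sketch is not the Bayesian model from the post and says nothing about uncertainty; it only divides `xplain` by `games` for each row. The variable names are illustrative, and the reading of `xplain` as "games with the comments being modeled" is an assumption based on the column names.

```python
# Rows copied from the tribble above: (game_type, gender, xplain, games).
rows = [
    ("comp", "female", 7, 35),
    ("comp", "male", 1, 23),
    ("qp", "female", 6, 149),
    ("qp", "male", 2, 132),
]

# Raw proportion of games with a flagged comment, per condition.
rates = {(game_type, gender): xplain / games
         for game_type, gender, xplain, games in rows}

for (game_type, gender), rate in rates.items():
    print(f"{game_type:>4} {gender:<6} {rate:.3f}")
```

With counts this small, the raw rates are noisy, which is why the post fits a model with explicit priors rather than comparing the proportions directly.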
This is an archive of the original "fsyncgate" email thread. This is posted here because I wanted to have a link that would fit on a slide for a talk on file safety with a mobile-friendly non-bloated format.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Date: 2018-03-28 02:23:46

Hi all

Some time ago I ran into an issue where a user encountered data corruption after a storage error. PostgreSQL played a part in that corruption by allowing checkpoint what should've been a fatal error.

TL;DR: Pg should PANIC on fsync() EIO return. Retrying fsync() is not OK at least on Linux. When fsync() returns success it means "all writes since the last fsync have hit disk" but we assume it means "all writes since the last SUCCESSFUL fsync have hit disk".

Pg wrote some blocks, which went to OS dirty buffers for writeback. Writeback failed due to an underlying storage error. The block I/O layer and XFS marked the writeback page as failed (AS_EIO), but had no way to tell the app about the failure. When Pg called fsync() on the FD during the next checkpoint, fsync() returned EIO because of the flagged page, to tell Pg that a previous async write failed. Pg treated the checkpoint as failed and didn't advance the redo start position in the control file.

All good so far.

But then we retried the checkpoint, which retried the fsync(). The retry succeeded, because the prior fsync() cleared the AS_EIO bad page flag.

The write never made it to disk, but we completed the checkpoint, and merrily carried on our way. Whoops, data loss.

The clear-error-and-continue behaviour of fsync is not documented as far as I can tell. Nor is fsync() returning EIO unless you have a very new linux man-pages with the patch I wrote to add it.
But from what I can see in the POSIX standard we are not given any guarantees about what happens on fsync() failure at all, so we're probably wrong to assume that retrying fsync() is safe.

If the server had been using ext3 or ext4 with errors=remount-ro, the problem wouldn't have occurred because the first I/O error would've remounted the FS and stopped Pg from continuing. But XFS doesn't have that option. There may be other situations where this can occur too, involving LVM and/or multipath, but I haven't comprehensively dug out the details yet.

It proved possible to recover the system by faking up a backup label from before the first incorrectly-successful checkpoint, forcing redo to repeat and write the lost blocks. But ... what a mess.

I posted about the underlying fsync issue here some time ago:

https://stackoverflow.com/q/42434872/398670

but haven't had a chance to follow up about the Pg specifics.

I've been looking at the problem on and off and haven't come up with a good answer. I think we should just PANIC and let redo sort it out by repeating the failed write when it repeats work since the last checkpoint.

The API offered by async buffered writes and fsync offers us no way to find out which page failed, so we can't just selectively redo that write. I think we do know the relfilenode associated with the fd that failed to fsync, but not much more. So the alternative seems to be some sort of potentially complex online-redo scheme where we replay WAL only the relation on which we had the fsync() error, while otherwise servicing queries normally. That's likely to be extremely error-prone and hard to test, and it's trying to solve a case where on other filesystems the whole DB would grind to a halt anyway.

I looked into whether we can solve it with use of the AIO API instead, but the mess is even worse there - from what I can tell you can't even reliably guarantee fsync at all on all Linux kernel versions.

We already PANIC on fsync() failure for WAL segments.
We just need to do the same for data forks at least for EIO. This isn't as bad as it seems because AFAICS fsync only returns EIO in cases where we should be stopping the world anyway, and many FSes will do that for us.

There are rather a lot of pg_fsync() callers. While we could handle this case-by-case for each one, I'm tempted to just make pg_fsync() itself intercept EIO and PANIC. Thoughts?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: 2018-03-28 03:53:08

Craig Ringer writes:
> TL;DR: Pg should PANIC on fsync() EIO return.

Surely you jest.

> Retrying fsync() is not OK at least on Linux. When fsync() returns success it means "all writes since the last fsync have hit disk" but we assume it means "all writes since the last SUCCESSFUL fsync have hit disk".

If that's actually the case, we need to push back on this kernel brain damage, because as you're describing it fsync would be completely useless.

Moreover, POSIX is entirely clear that successful fsync means all preceding writes for the file have been completed, full stop, doesn't matter when they were issued.

From: Michael Paquier <michael(at)paquier(dot)xyz>
Date: 2018-03-29 02:30:59

On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:
> Craig Ringer writes:
>> TL;DR: Pg should PANIC on fsync() EIO return.
>
> Surely you jest.

Any callers of pg_fsync in the backend code are careful enough to check the returned status, sometimes doing retries like in mdsync, so what is proposed here would be a regression.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-03-29 02:48:27

On Thu, Mar 29, 2018 at 3:30 PM, Michael Paquier wrote:
> On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:
>> Craig Ringer writes:
>>> TL;DR: Pg should PANIC on fsync() EIO return.
>>
>> Surely you jest.
>
> Any callers of pg_fsync in the backend code are careful enough to check the returned status, sometimes doing retries like in mdsync, so what is proposed here would be a regression.

Craig, is the phenomenon you described the same as the second issue "Reporting writeback
errors" discussed in this article?

https://lwn.net/Articles/724307/

"Current kernels might report a writeback error on an fsync() call, but there are a number of ways in which that can fail to happen."

That's... I'm speechless.

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Date: 2018-03-29 05:00:31

On Thu, Mar 29, 2018 at 11:30:59AM +0900, Michael Paquier wrote:
> On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:
>> Craig Ringer writes:
>>> TL;DR: Pg should PANIC on fsync() EIO return.
>>
>> Surely you jest.
>
> Any callers of pg_fsync in the backend code are careful enough to check the returned status, sometimes doing retries like in mdsync, so what is proposed here would be a regression.

The retries are the source of the problem ; the first fsync() can return EIO, and also clears the error causing a 2nd fsync (of the same data) to return success.

(Note, I can see that it might be useful to PANIC on EIO but retry for ENOSPC).

On Thu, Mar 29, 2018 at 03:48:27PM +1300, Thomas Munro wrote:
> Craig, is the phenomenon you described the same as the second issue "Reporting writeback errors" discussed in this article?
> https://lwn.net/Articles/724307/

Worse, the article acknowledges the behavior without apparently suggesting to change it:

"Storing that value in the file structure has an important benefit: it makes it possible to report a writeback error EXACTLY ONCE TO EVERY PROCESS THAT CALLS FSYNC() ....
In current kernels, ONLY THE FIRST CALLER AFTER AN ERROR OCCURS HAS A CHANCE OF SEEING THAT ERROR INFORMATION."

I believe I reproduced the problem behavior using dmsetup "error" target, see attached.

strace looks like this:

kernel is Linux 4.10.0-28-generic #32~16.04.2-Ubuntu SMP Thu Jul 20 10:19:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

     1  open("/dev/mapper/eio", O_RDWR|O_CREAT, 0600) = 3
     2  write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
     3  write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
     4  write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
     5  write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
     6  write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
     7  write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
     8  write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 2560
     9  write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = -1 ENOSPC (No space left on device)
    10  dup(2) = 4
    11  fcntl(4, F_GETFL) = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
    12  brk(NULL) = 0x1299000
    13  brk(0x12ba000) = 0x12ba000
    14  fstat(4, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
    15  write(4, "write(1): No space left on devic"..., 34write(1): No space left on device
    16  ) = 34
    17  close(4) = 0
    18  fsync(3) = -1 EIO (Input/output error)
    19  dup(2) = 4
    20  fcntl(4, F_GETFL) = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
    21  fstat(4, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
    22  write(4, "fsync(1): Input/output error\n", 29fsync(1): Input/output error
    23  ) = 29
    24  close(4) = 0
    25  close(3) = 0
    26  open("/dev/mapper/eio", O_RDWR|O_CREAT, 0600) = 3
    27  fsync(3) = 0
    28  write(3, "\0", 1) = 1
    29  fsync(3) = 0
    30  exit_group(0) = ?

2: EIO isn't seen initially due to writeback page cache;
9: ENOSPC due to small device
18: original IO error reported by fsync, good
25: the original FD is closed
26: ..and file reopened
27: fsync on file with still-dirty data+EIO returns success BAD

10, 19: I'm not sure why there's dup(2), I guess glibc thinks that perror should write to a separate FD (?)

Also note, close() ALSO returned success..which you might think exonerates the 2nd fsync(), but I think may itself be problematic, no? In any case, the 2nd byte certainly never got written to DM error, and the failure status was lost following fsync().

I get the exact same behavior if I break after one write() loop, such as to avoid ENOSPC.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-03-29 05:06:22

On Thu, Mar 29, 2018 at 6:00 PM, Justin Pryzby wrote:
> The retries are the source of the problem ; the first fsync() can return EIO, and also clears the error causing a 2nd fsync (of the same data) to return success.

What I'm failing to grok here is how that error flag even matters, whether it's a single bit or a counter as described in that patch. If write back failed, the page is still dirty. So all future calls to fsync() need to try to try to flush it again, and (presumably) fail again (unless it happens to succeed this time around).

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-03-29 05:25:51

On 29 March 2018 at 13:06, Thomas Munro wrote:
> On Thu, Mar 29, 2018 at 6:00 PM, Justin Pryzby wrote:
>> The retries are the source of the problem ; the first fsync() can return EIO, and also clears the error causing a 2nd fsync (of the same data) to return success.
>
> What I'm failing to grok here is how that error flag even matters, whether it's a single bit or a counter as described in that patch. If write back failed, the page is still dirty. So all future calls to fsync() need to try to try to flush it again, and (presumably) fail again (unless it happens to succeed this time around).
>
> http://www.enterprisedb.com

You'd think so. But it doesn't appear to work that way.
You can see yourself with the error device-mapper destination mapped over part of a volume.

I wrote a test case here.

https://github.com/ringerc/scrapcode/blob/master/testcases/fsync-error-clear.c

I don't pretend the kernel behaviour is sane. And it's possible I've made an error in my analysis. But since I've observed this in the wild, and seen it in a test case, I strongly suspect that what I've described is just what's happening, brain-dead or no.

Presumably the kernel marks the page clean when it dispatches it to the I/O subsystem and doesn't dirty it again on I/O error? I haven't dug that deep on the kernel side. See the stackoverflow post for details on what I found in kernel code analysis.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-03-29 05:32:43

On 29 March 2018 at 10:48, Thomas Munro wrote:
> On Thu, Mar 29, 2018 at 3:30 PM, Michael Paquier wrote:
>> On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:
>>> Craig Ringer writes:
>>>> TL;DR: Pg should PANIC on fsync() EIO return.
>>> Surely you jest.
>> Any callers of pg_fsync in the backend code are careful enough to check the returned status, sometimes doing retries like in mdsync, so what is proposed here would be a regression.
> Craig, is the phenomenon you described the same as the second issue "Reporting writeback errors" discussed in this article?
> https://lwn.net/Articles/724307/

A variant of it, by the looks.

The problem in our case is that the kernel only tells us about the error once. It then forgets about it. So yes, that seems like a variant of the statement:

> "Current kernels might report a writeback error on an fsync() call, but there are a number of ways in which that can fail to happen."
> That's... I'm speechless.

Yeah.

It's a bit nuts.

I was astonished when I saw the behaviour, and that it appears undocumented.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-03-29 05:35:47

On 29 March 2018 at 10:30, Michael Paquier wrote:
> On Tue, Mar 27, 2018 at 11:53:08PM -0400, Tom Lane wrote:
>> Craig Ringer writes:
>>> TL;DR: Pg should PANIC on fsync() EIO return.
>> Surely you jest.
> Any callers of pg_fsync in the backend code are careful enough to check the returned status, sometimes doing retries like in mdsync, so what is proposed here would be a regression.

I covered this in my original post.

Yes, we check the return value. But what do we do about it? For fsyncs of heap files, we ERROR, aborting the checkpoint. We'll retry the checkpoint later, which will retry the fsync(). Which will now appear to succeed because the kernel forgot that it lost our writes after telling us the first time. So we do check the error code, which returns success, and we complete the checkpoint and move on.

But we only retried the fsync, not the writes before the fsync.

So we lost data. Or rather, failed to detect that the kernel did so, so our checkpoint was bad and could not be completed.

The problem is that we keep retrying checkpoints without repeating the writes leading up to the checkpoint, and retrying fsync.

I don't pretend the kernel behaviour is sane, but we'd better deal with it anyway.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-03-29 05:58:45

On 28 March 2018 at 11:53, Tom Lane wrote:
> Craig Ringer writes:
>> TL;DR: Pg should PANIC on fsync() EIO return.
> Surely you jest.

No. I'm quite serious. Worse, we quite possibly have to do it for ENOSPC as well to avoid similar lost-page-write issues.

It's not necessary on ext3/ext4 with errors=remount-ro, but that's only because the FS stops us dead in our tracks.

I don't pretend it's sane. The kernel behaviour is IMO crazy.
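For illustration only, the "do not retry" policy argued for above might look like the following Python sketch. It is not PostgreSQL code: the helper name `fsync_or_die` and its `on_failure` parameter are hypothetical, and it assumes the caller recovers from a known-good state (such as replaying a log) after the process stops.

```python
import os
import sys


def fsync_or_die(fd, on_failure=os.abort):
    """Flush fd to storage; on any fsync() failure, stop instead of retrying.

    Hypothetical helper. The default on_failure aborts the process so that
    recovery can start over from the last state known to be durable.
    """
    try:
        os.fsync(fd)
    except OSError as exc:
        # A later fsync() on the same file may report success even though
        # the earlier writes were lost, so the failure is treated as final.
        sys.stderr.write(f"fsync failed: {exc}; not retrying\n")
        on_failure()
        raise  # only reached if on_failure() returns
```

The `on_failure` hook exists so the policy can be exercised in tests without killing the interpreter; the important property is that a failed fsync() is never followed by another fsync() on the same unrewritten data.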
If it's going to lose a write, it should at minimum mark the FD as broken so no further fsync() or anything else can succeed on the FD, and an app that cares about durability must repeat the whole set of work since the prior successful fsync(). Just reporting it once and forgetting it is madness.

But even if we convince the kernel folks of that, how do other platforms behave? And how long before these kernels are out of use? We'd better deal with it, crazy or no.

Please see my StackOverflow post for the kernel-level explanation. Note also the test case link there. https://stackoverflow.com/a/42436054/398670

>> Retrying fsync() is not OK at least on Linux. When fsync() returns success it means "all writes since the last fsync have hit disk" but we assume it means "all writes since the last SUCCESSFUL fsync have hit disk".
> If that's actually the case, we need to push back on this kernel brain damage, because as you're describing it fsync would be completely useless.

It's not useless, it's just telling us something other than what we think it means. The promise it seems to give us is that if it reports an error once, everything after that is useless, so we should throw our toys, close and reopen everything, and redo from the last known-good state.

Though as Tomas posted below, it provides rather weaker guarantees than I thought in some other areas too. See that lwn.net article he linked.

> Moreover, POSIX is entirely clear that successful fsync means all preceding writes for the file have been completed, full stop, doesn't matter when they were issued.

I can't find anything that says so to me. Please quote relevant spec.

I'm working from http://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html which states that

"The fsync() function shall request that all data for the open file descriptor named by fildes is to be transferred to the storage device associated with the file described by fildes. The nature of the transfer is implementation-defined. The fsync() function shall not return until the system has completed that action or until an error is detected."

My reading is that POSIX does not specify what happens AFTER an error is detected. It doesn't say that error has to be persistent and that subsequent calls must also report the error. It also says:

"If the fsync() function fails, outstanding I/O operations are not guaranteed to have been completed."

but that doesn't clarify matters much either, because it can be read to mean that once there's been an error reported for some IO operations there's no guarantee those operations are ever completed even after a subsequent fsync returns success.

I'm not seeking to defend what the kernel seems to be doing. Rather, saying that we might see similar behaviour on other platforms, crazy or not. I haven't looked past linux yet, though.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-03-29 12:07:56

On Thu, Mar 29, 2018 at 6:58 PM, Craig Ringer wrote:
> On 28 March 2018 at 11:53, Tom Lane wrote:
>> Craig Ringer writes:
>>> TL;DR: Pg should PANIC on fsync() EIO return.
>> Surely you jest.
> No. I'm quite serious. Worse, we quite possibly have to do it for ENOSPC as well to avoid similar lost-page-write issues.

I found your discussion with kernel hacker Jeff Layton at https://lwn.net/Articles/718734/ in which he said: "The stackoverflow writeup seems to want a scheme where pages stay dirty after a writeback failure so that we can try to fsync them again. Note that that has never been the case in Linux after hard writeback failures, AFAIK, so programs should definitely not assume that behavior."

The article above that says the same thing a couple of different ways, ie that writeback failure leaves you with pages that are neither written to disk successfully nor marked dirty.

If I'm reading various articles correctly, the situation was even worse before his errseq_t stuff landed.
That fixed cases of completely unreported writeback failures due to sharing of PG_error for both writeback and read errors with certain filesystems, but it doesn't address the clean pages problem.

Yeah, I see why you want to PANIC.

>> Moreover, POSIX is entirely clear that successful fsync means all preceding writes for the file have been completed, full stop, doesn't matter when they were issued.
> I can't find anything that says so to me. Please quote relevant spec.
> I'm working from http://pubs.opengroup.org/onlinepubs/009695399/functions/fsync.html which states that
> "The fsync() function shall request that all data for the open file descriptor named by fildes is to be transferred to the storage device associated with the file described by fildes. The nature of the transfer is implementation-defined. The fsync() function shall not return until the system has completed that action or until an error is detected."
> My reading is that POSIX does not specify what happens AFTER an error is detected. It doesn't say that error has to be persistent and that subsequent calls must also report the error. It also says:

FWIW my reading is the same as Tom's. It says "all data for the open file descriptor" without qualification or special treatment after errors. Not "some".

> I'm not seeking to defend what the kernel seems to be doing. Rather, saying that we might see similar behaviour on other platforms, crazy or not. I haven't looked past linux yet, though.

I see no reason to think that any other operating system would behave that way without strong evidence... This is openly acknowledged to be "a mess" and "a surprise" in the Filesystem Summit article. I am not really qualified to comment, but from a cursory glance at FreeBSD's vfs_bio.c I think it's doing what you'd hope for... see the code near the comment "Failed write, redirty."

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-03-29 13:15:10

On 29 March 2018 at 20:07, Thomas Munro wrote:
> On Thu, Mar 29, 2018 at 6:58 PM, Craig Ringer wrote:
>> On 28 March 2018 at 11:53, Tom Lane wrote:
>>> Craig Ringer writes:
>>>> TL;DR: Pg should PANIC on fsync() EIO return.
>>> Surely you jest.
>> No. I'm quite serious. Worse, we quite possibly have to do it for ENOSPC as well to avoid similar lost-page-write issues.
> I found your discussion with kernel hacker Jeff Layton at https://lwn.net/Articles/718734/ in which he said: "The stackoverflow writeup seems to want a scheme where pages stay dirty after a writeback failure so that we can try to fsync them again. Note that that has never been the case in Linux after hard writeback failures, AFAIK, so programs should definitely not assume that behavior."
> The article above that says the same thing a couple of different ways, ie that writeback failure leaves you with pages that are neither written to disk successfully nor marked dirty.
> If I'm reading various articles correctly, the situation was even worse before his errseq_t stuff landed. That fixed cases of completely unreported writeback failures due to sharing of PG_error for both writeback and read errors with certain filesystems, but it doesn't address the clean pages problem.
> Yeah, I see why you want to PANIC.

In more ways than one ;)

>> I'm not seeking to defend what the kernel seems to be doing. Rather, saying that we might see similar behaviour on other platforms, crazy or not. I haven't looked past linux yet, though.
> I see no reason to think that any other operating system would behave that way without strong evidence... This is openly acknowledged to be "a mess" and "a surprise" in the Filesystem Summit article. I am not really qualified to comment, but from a cursory glance at FreeBSD's vfs_bio.c I think it's doing what you'd hope for... see the code near the comment "Failed write, redirty."

Ok, that's reassuring, but doesn't help us on the platform the great majority of users deploy on :(

"If on Linux, PANIC"

Hrm.

From: Catalin Iacob <iacobcatalin(at)gmail(dot)com>
Date: 2018-03-29 16:20:00

On Thu, Mar 29, 2018 at 2:07 PM, Thomas Munro wrote:
> I found your discussion with kernel hacker Jeff Layton at https://lwn.net/Articles/718734/ in which he said: "The stackoverflow writeup seems to want a scheme where pages stay dirty after a writeback failure so that we can try to fsync them again. Note that that has never been the case in Linux after hard writeback failures, AFAIK, so programs should definitely not assume that behavior."

And a bit below in the same comments, to this question about PG: "So, what are the options at this point? The assumption was that we can repeat the fsync (which as you point out is not the case), or shut down the database and perform recovery from WAL", the same Jeff Layton seems to agree PANIC is the appropriate response:

"Replaying the WAL synchronously sounds like the simplest approach when you get an error on fsync. These are uncommon occurrences for the most part, so having to fall back to slow, synchronous error recovery modes when this occurs is probably what you want to do."

And right after, he confirms the errseq_t patches are about always detecting this, not more:

"The main thing I working on is to better guarantee is that you actually get an error when this occurs rather than silently corrupting your data.
The circumstances where that can occur require some corner-cases, but I think we need to make sure that it doesn't occur."

Jeff's comments in the pull request that merged errseq_t are worth reading as well:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=088737f44bbf6378745f5b57b035e57ee3dc4750

> The article above that says the same thing a couple of different ways, ie that writeback failure leaves you with pages that are neither written to disk successfully nor marked dirty.
> If I'm reading various articles correctly, the situation was even worse before his errseq_t stuff landed. That fixed cases of completely unreported writeback failures due to sharing of PG_error for both writeback and read errors with certain filesystems, but it doesn't address the clean pages problem.

Indeed, that's exactly how I read it as well (opinion formed independently before reading your sentence above). The errseq_t patches landed in v4.13 by the way, so very recently.

> Yeah, I see why you want to PANIC.

Indeed. Even doing that leaves question marks about all the kernel versions before v4.13, which at this point is pretty much everything out there, not even detecting this reliably. This is messy.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-03-29 21:18:14

On Fri, Mar 30, 2018 at 5:20 AM, Catalin Iacob wrote:
> Jeff's comments in the pull request that merged errseq_t are worth reading as well:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=088737f44bbf6378745f5b57b035e57ee3dc4750

Wow. It looks like there may be a separate question of when each filesystem adopted this new infrastructure?

>> Yeah, I see why you want to PANIC.
> Indeed. Even doing that leaves question marks about all the kernel versions before v4.13, which at this point is pretty much everything out there, not even detecting this reliably. This is messy.

The pre-errseq_t problems are beyond our control.
There's nothing we can do about that in userspace (except perhaps abandon OS-buffered IO, a big project). We just need to be aware that this problem exists in certain kernel versions and be grateful to Layton for fixing it.

The dropped dirty flag problem is something we can and in my view should do something about, whatever we might think about that design choice. As Andrew Gierth pointed out to me in an off-list chat about this, by the time you've reached this state, both PostgreSQL's buffer and the kernel's buffer are clean and might be reused for another block at any time, so your data might be gone from the known universe -- we don't even have the option to rewrite our buffers in general. Recovery is the only option.

Thank you to Craig for chasing this down and +1 for his proposal, on Linux only.

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-03-31 13:24:28

On Fri, Mar 30, 2018 at 10:18:14AM +1300, Thomas Munro wrote:
>> Yeah, I see why you want to PANIC.
> Indeed. Even doing that leaves question marks about all the kernel versions before v4.13, which at this point is pretty much everything out there, not even detecting this reliably. This is messy.

There may still be a way to reliably detect this on older kernel versions from userspace, but it will be messy whatsoever. On EIO errors, the kernel will not restore the dirty page flags, but it will flip the error flags on the failed pages. One could mmap() the file in question, obtain the PFNs (via /proc/pid/pagemap) and enumerate those to match the ones with the error flag switched on (via /proc/kpageflags). This could serve at least as a detection mechanism, but one could also further use this info to logically map the pages that failed IO back to the original file offsets, and potentially retry IO just for those file ranges that cover the failed pages.
Just an idea, not tested.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-03-31 16:13:09

On 31 March 2018 at 21:24, Anthony Iliopoulos wrote:
> On Fri, Mar 30, 2018 at 10:18:14AM +1300, Thomas Munro wrote:
>>> Yeah, I see why you want to PANIC.
>> Indeed. Even doing that leaves question marks about all the kernel versions before v4.13, which at this point is pretty much everything out there, not even detecting this reliably. This is messy.
> There may still be a way to reliably detect this on older kernel versions from userspace, but it will be messy whatsoever. On EIO errors, the kernel will not restore the dirty page flags, but it will flip the error flags on the failed pages. One could mmap() the file in question, obtain the PFNs (via /proc/pid/pagemap) and enumerate those to match the ones with the error flag switched on (via /proc/kpageflags). This could serve at least as a detection mechanism, but one could also further use this info to logically map the pages that failed IO back to the original file offsets, and potentially retry IO just for those file ranges that cover the failed pages. Just an idea, not tested.

That sounds like a huge amount of complexity, with uncertainty as to how it'll behave kernel-to-kernel, for negligible benefit.

I was exploring the idea of doing selective recovery of one relfilenode, based on the assumption that we know the filenode related to the fd that failed to fsync(). We could redo only WAL on that relation. But it fails the same test: it's too complex for a niche case that shouldn't happen in the first place, so it'll probably have bugs, or grow bugs in bitrot over time.

Remember, if you're on ext4 with errors=remount-ro, you get shut down even harder than a PANIC.
So we should just use the big hammer here.

I'll send a patch this week.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: 2018-03-31 16:38:12

Craig Ringer writes:
> So we should just use the big hammer here.

And bitch, loudly and publicly, about how broken this kernel behavior is. If we make enough of a stink maybe it'll get fixed.

From: Michael Paquier <michael(at)paquier(dot)xyz>
Date: 2018-04-01 00:20:38

On Sat, Mar 31, 2018 at 12:38:12PM -0400, Tom Lane wrote:
> Craig Ringer writes:
>> So we should just use the big hammer here.
> And bitch, loudly and publicly, about how broken this kernel behavior is. If we make enough of a stink maybe it'll get fixed.

That won't fix anything released already, so as per the information gathered something has to be done anyway. The discussion of this thread is spreading quite a lot actually.

Handling things at a low-level looks like a better plan for the backend. Tools like pg_basebackup and pg_dump also issue fsync's on the data created, we should do an equivalent for them, with some exit() calls in file_utils.c. As of now failures are logged to stderr but not considered fatal.

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-01 00:58:22

On Sun, Apr 01, 2018 at 12:13:09AM +0800, Craig Ringer wrote:
> On 31 March 2018 at 21:24, Anthony Iliopoulos <ailiop(at)altatus(dot)com> wrote:
>> On Fri, Mar 30, 2018 at 10:18:14AM +1300, Thomas Munro wrote:
>>>> Yeah, I see why you want to PANIC.
>>> Indeed. Even doing that leaves question marks about all the kernel versions before v4.13, which at this point is pretty much everything out there, not even detecting this reliably. This is messy.
>> There may still be a way to reliably detect this on older kernel versions from userspace, but it will be messy whatsoever. On EIO errors, the kernel will not restore the dirty page flags, but it will flip the error flags on the failed pages.
>> One could mmap() the file in question, obtain the PFNs (via /proc/pid/pagemap) and enumerate those to match the ones with the error flag switched on (via /proc/kpageflags). This could serve at least as a detection mechanism, but one could also further use this info to logically map the pages that failed IO back to the original file offsets, and potentially retry IO just for those file ranges that cover the failed pages. Just an idea, not tested.
> That sounds like a huge amount of complexity, with uncertainty as to how it'll behave kernel-to-kernel, for negligible benefit.

Those interfaces have been around since the kernel 2.6 times and are rather stable, but I was merely responding to your original post comment regarding having a way of finding out which page(s) failed. I assume that indeed there would be no benefit, especially since those errors are usually not transient (typically they come from hard medium faults), and although a filesystem could theoretically mask the error by allocating a different logical block, I am not aware of any implementation that currently does that.

> I was exploring the idea of doing selective recovery of one relfilenode, based on the assumption that we know the filenode related to the fd that failed to fsync(). We could redo only WAL on that relation. But it fails the same test: it's too complex for a niche case that shouldn't happen in the first place, so it'll probably have bugs, or grow bugs in bitrot over time.

Fully agree, those cases should be sufficiently rare that a complex and possibly non-maintainable solution is not really warranted.

> Remember, if you're on ext4 with errors=remount-ro, you get shut down even harder than a PANIC. So we should just use the big hammer here.

I am not entirely sure what you mean here, does Pg really treat write() errors as fatal?
Also, the kind of errors that ext4 detects with this option is at the superblock level and govern metadata rather than actual data writes (recall that those are buffered anyway, no actual device IO has to take place at the time of write()).

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-01 01:14:46

On Sat, Mar 31, 2018 at 12:38:12PM -0400, Tom Lane wrote:
> Craig Ringer writes:
>> So we should just use the big hammer here.
> And bitch, loudly and publicly, about how broken this kernel behavior is. If we make enough of a stink maybe it'll get fixed.

It is not likely to be fixed (beyond what has been done already with the manpage patches and errseq_t fixes on the reporting level). The issue is, the kernel needs to deal with hard IO errors at that level somehow, and since those errors typically persist, re-dirtying the pages would not really solve the problem (unless some filesystem remaps the request to a different block, assuming the device is alive). Keeping around dirty pages that cannot possibly be written out is essentially a memory leak, as those pages would stay around even after the application has exited.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-01 18:24:51

On Fri, Mar 30, 2018 at 10:18 AM, Thomas Munro wrote:
> ... on Linux only.

Apparently I was too optimistic. I had looked only at FreeBSD, which keeps the page around and dirties it so we can retry, but the other BSDs apparently don't (FreeBSD changed that in 1999). From what I can tell from the sources below, we have:

Linux, OpenBSD, NetBSD: retrying fsync() after EIO lies
FreeBSD, Illumos: retrying fsync() after EIO tells the truth

Maybe my drive-by assessment of those kernel routines is wrong and someone will correct me, but I'm starting to think you might be better to assume the worst on all systems. Perhaps a GUC that defaults to panicking, so that users on those rare OSes could turn that off?
Even then I'm not sure if the failure mode will be that great anyway or if it's worth having two behaviours. Thoughts?

http://mail-index.netbsd.org/netbsd-users/2018/03/30/msg020576.html
https://github.com/NetBSD/src/blob/trunk/sys/kern/vfs_bio.c#L1059
https://github.com/openbsd/src/blob/master/sys/kern/vfs_bio.c#L867
https://github.com/freebsd/freebsd/blob/master/sys/kern/vfs_bio.c#L2631
https://github.com/freebsd/freebsd/commit/e4e8fec98ae986357cdc208b04557dba55a59266
https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/os/bio.c#L441

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-02 15:03:42

On 2 April 2018 at 02:24, Thomas Munro wrote:
> Maybe my drive-by assessment of those kernel routines is wrong and someone will correct me, but I'm starting to think you might be better to assume the worst on all systems. Perhaps a GUC that defaults to panicking, so that users on those rare OSes could turn that off? Even then I'm not sure if the failure mode will be that great anyway or if it's worth having two behaviours. Thoughts?

I see little benefit to not just PANICing unconditionally on EIO, really. It shouldn't happen, and if it does, we want to be pretty conservative and adopt a data-protective approach.

I'm rather more worried by doing it on ENOSPC. Which looks like it might be necessary from what I recall finding in my test case + kernel code reading. I really don't want to respond to a possibly-transient ENOSPC by PANICing the whole server unnecessarily.

BTW, the support team at 2ndQ is presently working on two separate issues where ENOSPC resulted in DB corruption, though neither of them involve logs of lost page writes.
I'm planning on taking some time tomorrow to write a torture tester for Pg's ENOSPC handling and to verify ENOSPC handling in the test case I linked to in my original StackOverflow post.

If this is just an EIO issue then I see no point doing anything other than PANICing unconditionally.

If it's a concern for ENOSPC too, we should try harder to fail more nicely whenever we possibly can.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-02 18:13:46

Hi,

On 2018-04-01 03:14:46 +0200, Anthony Iliopoulos wrote:
> On Sat, Mar 31, 2018 at 12:38:12PM -0400, Tom Lane wrote:
>> Craig Ringer writes:
>>> So we should just use the big hammer here.
>> And bitch, loudly and publicly, about how broken this kernel behavior is. If we make enough of a stink maybe it'll get fixed.
> It is not likely to be fixed (beyond what has been done already with the manpage patches and errseq_t fixes on the reporting level). The issue is, the kernel needs to deal with hard IO errors at that level somehow, and since those errors typically persist, re-dirtying the pages would not really solve the problem (unless some filesystem remaps the request to a different block, assuming the device is alive).

Throwing away the dirty pages and persisting the error seems a lot more reasonable. Then provide a fcntl (or whatever) extension that can clear the error status in the few cases where the application wants to gracefully deal with the case.

> Keeping around dirty pages that cannot possibly be written out is essentially a memory leak, as those pages would stay around even after the application has exited.

Why do dirty pages need to be kept around in the case of persistent errors? I don't think the lack of automatic recovery in that case is what anybody is complaining about.
It's that the error goes away and there's no reasonable way to separate out such an error from some potential transient errors.

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-02 18:53:20

On Mon, Apr 02, 2018 at 11:13:46AM -0700, Andres Freund wrote:
> Hi,
> On 2018-04-01 03:14:46 +0200, Anthony Iliopoulos wrote:
>> On Sat, Mar 31, 2018 at 12:38:12PM -0400, Tom Lane wrote:
>>> Craig Ringer writes:
>>>> So we should just use the big hammer here.
>>> And bitch, loudly and publicly, about how broken this kernel behavior is. If we make enough of a stink maybe it'll get fixed.
>> It is not likely to be fixed (beyond what has been done already with the manpage patches and errseq_t fixes on the reporting level). The issue is, the kernel needs to deal with hard IO errors at that level somehow, and since those errors typically persist, re-dirtying the pages would not really solve the problem (unless some filesystem remaps the request to a different block, assuming the device is alive).
> Throwing away the dirty pages and persisting the error seems a lot more reasonable. Then provide a fcntl (or whatever) extension that can clear the error status in the few cases where the application wants to gracefully deal with the case.

Given precisely that the dirty pages which cannot be written out are practically thrown away, the semantics of fsync() (after the 4.13 fixes) are essentially correct: the first call indicates that a writeback error indeed occurred, while subsequent calls have no reason to indicate an error (assuming no other errors occurred in the meantime).

The error reporting is thus consistent with the intended semantics (which are sadly not properly documented). Repeated calls to fsync() simply do not imply that the kernel will retry to writeback the previously-failed pages, so the application needs to be aware of that.
Persisting the error at the fsync() level would essentially mean moving application policy into the kernel.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-02 19:32:45

On 2018-04-02 20:53:20 +0200, Anthony Iliopoulos wrote:
> On Mon, Apr 02, 2018 at 11:13:46AM -0700, Andres Freund wrote:
>> Throwing away the dirty pages and persisting the error seems a lot more reasonable. Then provide a fcntl (or whatever) extension that can clear the error status in the few cases where the application wants to gracefully deal with the case.
> Given precisely that the dirty pages which cannot be written out are practically thrown away, the semantics of fsync() (after the 4.13 fixes) are essentially correct: the first call indicates that a writeback error indeed occurred, while subsequent calls have no reason to indicate an error (assuming no other errors occurred in the meantime).

Meh^2.

"no reason" - except that there's absolutely no way to know what state the data is in. And that your application needs explicit handling of such failures. And that one FD might be used in lots of different parts of the application, that fsyncs in one part of the application might be an ok failure, and in another not. Requiring explicit actions to acknowledge "we've thrown away your data for unknown reason" seems entirely reasonable.

> The error reporting is thus consistent with the intended semantics (which are sadly not properly documented).
> Repeated calls to fsync() simply do not imply that the kernel will retry to writeback the previously-failed pages, so the application needs to be aware of that.

Which isn't what I've suggested.

> Persisting the error at the fsync() level would essentially mean moving application policy into the kernel.

Meh.

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-02 20:38:06

On Mon, Apr 02, 2018 at 12:32:45PM -0700, Andres Freund wrote:
> On 2018-04-02 20:53:20 +0200, Anthony Iliopoulos wrote:
>> On Mon, Apr 02, 2018 at 11:13:46AM -0700, Andres Freund wrote:
>>> Throwing away the dirty pages and persisting the error seems a lot more reasonable. Then provide a fcntl (or whatever) extension that can clear the error status in the few cases where the application wants to gracefully deal with the case.
>> Given precisely that the dirty pages which cannot be written out are practically thrown away, the semantics of fsync() (after the 4.13 fixes) are essentially correct: the first call indicates that a writeback error indeed occurred, while subsequent calls have no reason to indicate an error (assuming no other errors occurred in the meantime).
> Meh^2.
> "no reason" - except that there's absolutely no way to know what state the data is in. And that your application needs explicit handling of such failures. And that one FD might be used in lots of different parts of the application, that fsyncs in one part of the application might be an ok failure, and in another not. Requiring explicit actions to acknowledge "we've thrown away your data for unknown reason" seems entirely reasonable.

As long as fsync() indicates error on first invocation, the application is fully aware that between this point of time and the last call to fsync() data has been lost.
Persisting this error any further does not change this or add any new info - on the contrary it adds confusion as subsequent write()s and fsync()s on other pages can succeed, but will be reported as failures.

The application will need to deal with that first error irrespective of subsequent return codes from fsync(). Conceptually every fsync() invocation demarcates an epoch for which it reports potential errors, so the caller needs to take responsibility for that particular epoch.

Callers that are not affected by the potential outcome of fsync() and do not react on errors, have no reason for calling it in the first place (and thus masking failure from subsequent callers that may indeed care).

From: Stephen Frost <sfrost(at)snowman(dot)net>
Date: 2018-04-02 20:58:08

Greetings,

Anthony Iliopoulos (ailiop(at)altatus(dot)com) wrote:
> On Mon, Apr 02, 2018 at 12:32:45PM -0700, Andres Freund wrote:
>> On 2018-04-02 20:53:20 +0200, Anthony Iliopoulos wrote:
>>> On Mon, Apr 02, 2018 at 11:13:46AM -0700, Andres Freund wrote:
>>>> Throwing away the dirty pages and persisting the error seems a lot more reasonable. Then provide a fcntl (or whatever) extension that can clear the error status in the few cases where the application wants to gracefully deal with the case.
>>> Given precisely that the dirty pages which cannot be written out are practically thrown away, the semantics of fsync() (after the 4.13 fixes) are essentially correct: the first call indicates that a writeback error indeed occurred, while subsequent calls have no reason to indicate an error (assuming no other errors occurred in the meantime).
>> Meh^2.
>> "no reason" - except that there's absolutely no way to know what state the data is in. And that your application needs explicit handling of such failures. And that one FD might be used in lots of different parts of the application, that fsyncs in one part of the application might be an ok failure, and in another not.
>> Requiring explicit actions to acknowledge "we've thrown away your data for unknown reason" seems entirely reasonable.
>
> As long as fsync() indicates error on first invocation, the application is fully aware that between this point of time and the last call to fsync() data has been lost. Persisting this error any further does not change this or add any new info - on the contrary it adds confusion as subsequent write()s and fsync()s on other pages can succeed, but will be reported as failures.

fsync() doesn't reflect the status of given pages, however, it reflects the status of the file descriptor, and as such the file, on which it's called. This notion that fsync() is actually only responsible for the changes which were made to a file since the last fsync() call is pure foolishness. If we were able to pass a list of pages or data ranges to fsync() for it to verify they're on disk then perhaps things would be different, but we can't, all we can do is ask to "please flush all the dirty pages associated with this file descriptor, which represents this file we opened, to disk, and let us know if you were successful."

Give us a way to ask "are these specific pages written out to persistant storage?" and we would certainly be happy to use it, and to repeatedly try to flush out pages which weren't synced to disk due to some transient error, and to track those cases and make sure that we don't incorrectly assume that they've been transferred to persistent storage.

> The application will need to deal with that first error irrespective of subsequent return codes from fsync(). Conceptually every fsync() invocation demarcates an epoch for which it reports potential errors, so the caller needs to take responsibility for that particular epoch.

We do deal with that error- by realizing that it failed and later retrying the fsync(), which is when we get back an "all good! everything with this file descriptor you've opened is sync'd!"
and happily expect that to be truth, when, in reality, it's an unfortunate lie and there are still pages associated with that file descriptor which are, in reality, dirty and not sync'd to disk.

Consider two independent programs where the first one writes to a file and then calls the second one whose job it is to go out and fsync(), perhaps async from the first, those files. Is the second program supposed to go write to each page that the first one wrote to, in order to ensure that all the dirty bits are set so that the fsync() will actually return if all the dirty pages are written?

> Callers that are not affected by the potential outcome of fsync() and do not react on errors, have no reason for calling it in the first place (and thus masking failure from subsequent callers that may indeed care).

Reacting on an error from an fsync() call could, based on how it's documented and actually implemented in other OS's, mean "run another fsync() to see if the error has resolved itself." Requiring that to mean "you have to go dirty all of the pages you previously dirtied to actually get a subsequent fsync() to do anything" is really just not reasonable- a given program may have no idea what was written to previously nor any particular reason to need to know, on the expectation that the fsync() call will flush any dirty pages, as it's documented to do.

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-02 23:05:44

Hi Stephen,

On Mon, Apr 02, 2018 at 04:58:08PM -0400, Stephen Frost wrote:
> fsync() doesn't reflect the status of given pages, however, it reflects the status of the file descriptor, and as such the file, on which it's called. This notion that fsync() is actually only responsible for the changes which were made to a file since the last fsync() call is pure foolishness.
> If we were able to pass a list of pages or data ranges to fsync() for it to verify they're on disk then perhaps things would be different, but we can't, all we can do is ask to "please flush all the dirty pages associated with this file descriptor, which represents this file we opened, to disk, and let us know if you were successful."
>
> Give us a way to ask "are these specific pages written out to persistant storage?" and we would certainly be happy to use it, and to repeatedly try to flush out pages which weren't synced to disk due to some transient error, and to track those cases and make sure that we don't incorrectly assume that they've been transferred to persistent storage.

Indeed fsync() is simply a rather blunt instrument and a narrow legacy interface but further changing its established semantics (no matter how unreasonable they may be) is probably not the way to go.

Would using sync_file_range() be helpful? Potential errors would only apply to pages that cover the requested file ranges. There are a few caveats though:

(a) it still messes with the top-level error reporting so mixing it with callers that use fsync() and do care about errors will produce the same issue (clearing the error status).

(b) the error-reporting granularity is coarse (failure reporting applies to the entire requested range so you still don't know which particular pages/file sub-ranges failed writeback)

(c) the same "report and forget" semantics apply to repeated invocations of the sync_file_range() call, so again action will need to be taken upon first error encountered for the particular ranges.

>> The application will need to deal with that first error irrespective of subsequent return codes from fsync().
>> Conceptually every fsync() invocation demarcates an epoch for which it reports potential errors, so the caller needs to take responsibility for that particular epoch.
>
> We do deal with that error- by realizing that it failed and later retrying the fsync(), which is when we get back an "all good! everything with this file descriptor you've opened is sync'd!" and happily expect that to be truth, when, in reality, it's an unfortunate lie and there are still pages associated with that file descriptor which are, in reality, dirty and not sync'd to disk.

It really turns out that this is not how the fsync() semantics work though, exactly because the nature of the errors: even if the kernel retained the dirty bits on the failed pages, retrying persisting them on the same disk location would simply fail. Instead the kernel opts for marking those pages clean (since there is no other recovery strategy), and reporting once to the caller who can potentially deal with it in some manner. It is sadly a bad and undocumented convention.

> Consider two independent programs where the first one writes to a file and then calls the second one whose job it is to go out and fsync(), perhaps async from the first, those files. Is the second program supposed to go write to each page that the first one wrote to, in order to ensure that all the dirty bits are set so that the fsync() will actually return if all the dirty pages are written?

I think what you have in mind are the semantics of sync() rather than fsync(), but as long as an application needs to ensure data are persisted to storage, it needs to retain those data in its heap until fsync() is successful instead of discarding them and relying on the kernel after write().
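To make the "retain the data until fsync() succeeds" ordering concrete, here is a minimal Python sketch. It is not from the thread: `write_durably`, `fallback_path` and `demo` are hypothetical names, and the failure in `demo` is provoked by an unusable primary path only because a genuine fsync() error is hard to trigger on demand. The point it illustrates is that the payload stays referenced by the caller until `os.fsync()` returns, so a failed attempt is handled by rewriting the data elsewhere rather than by calling fsync() again.

```python
import os
import tempfile


def write_durably(path, payload, fallback_path=None):
    """Write payload to path; fall back to fallback_path if that fails.

    The caller's payload stays referenced until os.fsync() has returned
    without raising, so it can still be written somewhere else on failure.
    Returns the path that ended up holding the data.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            view = memoryview(payload)
            while view:                   # os.write() may write fewer bytes than asked
                view = view[os.write(fd, view):]
            os.fsync(fd)                  # only after this returns may payload be freed
        finally:
            os.close(fd)
        return path
    except OSError:
        if fallback_path is None:
            raise                         # no alternative: the caller must handle the loss
    # The first attempt failed; do not retry fsync() on it, rewrite elsewhere.
    return write_durably(fallback_path, payload)


def demo():
    """Hypothetical usage: the primary path is unusable, the fallback is not."""
    workdir = tempfile.mkdtemp()
    primary = os.path.join(workdir, "no-such-dir", "data.bin")
    fallback = os.path.join(workdir, "data.bin")
    return write_durably(primary, b"payload", fallback_path=fallback) == fallback
```

Whether a database can afford to keep every dirty page in its own memory until the next fsync() is a separate question that this sketch does not address.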
The pattern should be roughly like: write() -> fsync() -> free(), rather than write() -> free() -> fsync(). For example, if a partition gets full upon fsync(), then the application has a chance to persist the data in a different location, while the kernel cannot possibly make this decision and recover.

>> Callers that are not affected by the potential outcome of fsync() and do not react on errors, have no reason for calling it in the first place (and thus masking failure from subsequent callers that may indeed care).
>
> Reacting on an error from an fsync() call could, based on how it's documented and actually implemented in other OS's, mean "run another fsync() to see if the error has resolved itself." Requiring that to mean "you have to go dirty all of the pages you previously dirtied to actually get a subsequent fsync() to do anything" is really just not reasonable- a given program may have no idea what was written to previously nor any particular reason to need to know, on the expectation that the fsync() call will flush any dirty pages, as it's documented to do.

I think we are conflating a few issues here: having the OS kernel being responsible for error recovery (so that subsequent fsync() would fix the problems) is one. This clearly is a design which most kernels have not really adopted for reasons outlined above (although having the FS layer recovering from hard errors transparently is open for discussion from what it seems [1]). Now, there is the issue of granularity of error reporting: userspace could benefit from a fine-grained indication of failed pages (or file ranges). Another issue is that of reporting semantics (report and clear), which is also a design choice made to avoid having higher-resolution error tracking and the corresponding memory overheads [1].

[1] https://lwn.net/Articles/718734/

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-02 23:23:24

On 2018-04-03 01:05:44 +0200, Anthony Iliopoulos wrote:
> Would using sync_file_range() be helpful?
> Potential errors would only apply to pages that cover the requested file ranges. There are a few caveats though:

To quote sync_file_range(2):

    Warning
        This system call is extremely dangerous and should not be used in portable programs. None of these operations writes out the file's metadata. Therefore, unless the application is strictly performing overwrites of already-instantiated disk blocks, there are no guarantees that the data will be available after a crash. There is no user interface to know if a write is purely an overwrite. On filesystems using copy-on-write semantics (e.g., btrfs) an overwrite of existing allocated blocks is impossible. When writing into preallocated space, many filesystems also require calls into the block allocator, which this system call does not sync out to disk. This system call does not flush disk write caches and thus does not provide any data integrity on systems with volatile disk write caches.

Given the lack of metadata safety that seems entirely a no go. We use sfr(2), but only to force the kernel's hand around writing back earlier without throwing away cache contents.

>>> The application will need to deal with that first error irrespective of subsequent return codes from fsync(). Conceptually every fsync() invocation demarcates an epoch for which it reports potential errors, so the caller needs to take responsibility for that particular epoch.
>>
>> We do deal with that error- by realizing that it failed and later retrying the fsync(), which is when we get back an "all good! everything with this file descriptor you've opened is sync'd!"
>> and happily expect that to be truth, when, in reality, it's an unfortunate lie and there are still pages associated with that file descriptor which are, in reality, dirty and not sync'd to disk.
>
> It really turns out that this is not how the fsync() semantics work though

Except on freebsd and solaris, and perhaps others.

> , exactly because the nature of the errors: even if the kernel retained the dirty bits on the failed pages, retrying persisting them on the same disk location would simply fail.

That's not guaranteed at all, think NFS.

> Instead the kernel opts for marking those pages clean (since there is no other recovery strategy), and reporting once to the caller who can potentially deal with it in some manner. It is sadly a bad and undocumented convention.

It's broken behaviour justified post facto with the only rational that was available, which explains why it's so unconvincing. You could just say "this ship has sailed, and it's to onerous to change because xxx" and this'd be a done deal. But claiming this is reasonable behaviour is ridiculous.

Again, you could just continue to error for this fd and still throw away the data.

>> Consider two independent programs where the first one writes to a file and then calls the second one whose job it is to go out and fsync(), perhaps async from the first, those files. Is the second program supposed to go write to each page that the first one wrote to, in order to ensure that all the dirty bits are set so that the fsync() will actually return if all the dirty pages are written?
>
> I think what you have in mind are the semantics of sync() rather than fsync()

If you open the same file with two fds, and write with one, and fsync with another that's definitely supposed to work. And sync() isn't a realistic replacement in any sort of way because it's obviously systemwide, and thus entirely and completely unsuitable.
Nor does it have any sort of better error reporting behaviour, does it?

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-02 23:27:35

On 3 April 2018 at 07:05, Anthony Iliopoulos wrote:
> Hi Stephen,
>
> On Mon, Apr 02, 2018 at 04:58:08PM -0400, Stephen Frost wrote:
>> fsync() doesn't reflect the status of given pages, however, it reflects the status of the file descriptor, and as such the file, on which it's called. This notion that fsync() is actually only responsible for the changes which were made to a file since the last fsync() call is pure foolishness. If we were able to pass a list of pages or data ranges to fsync() for it to verify they're on disk then perhaps things would be different, but we can't, all we can do is ask to "please flush all the dirty pages associated with this file descriptor, which represents this file we opened, to disk, and let us know if you were successful."
>>
>> Give us a way to ask "are these specific pages written out to persistant storage?" and we would certainly be happy to use it, and to repeatedly try to flush out pages which weren't synced to disk due to some transient error, and to track those cases and make sure that we don't incorrectly assume that they've been transferred to persistent storage.
>
> Indeed fsync() is simply a rather blunt instrument and a narrow legacy interface but further changing its established semantics (no matter how unreasonable they may be) is probably not the way to go.

They're undocumented and extremely surprising semantics that are arguably a violation of the POSIX spec for fsync(), or at least a surprising interpretation of it.

So I don't buy this argument.

> It really turns out that this is not how the fsync() semantics work though, exactly because the nature of the errors: even if the kernel retained the dirty bits on the failed pages, retrying persisting them on the same disk location would simply fail.

might simply fail.

It depends on why the error ocurred.

I originally identified this behaviour on a multipath system.
Multipath defaults to "throw the writes away, nobody really cares anyway" on error. It seems to figure a higher level will retry, or the application will receive the error and retry.

(See no_path_retry in multipath config. AFAICS the default is insanely dangerous and only suitable for specialist apps that understand the quirks; you should use no_path_retry=queue).

> Instead the kernel opts for marking those pages clean (since there is no other recovery strategy), and reporting once to the caller who can potentially deal with it in some manner. It is sadly a bad and undocumented convention.

It could mark the FD.

It's not just undocumented, it's a slightly creative interpretation of the POSIX spec for fsync.

>> Consider two independent programs where the first one writes to a file and then calls the second one whose job it is to go out and fsync(), perhaps async from the first, those files. Is the second program supposed to go write to each page that the first one wrote to, in order to ensure that all the dirty bits are set so that the fsync() will actually return if all the dirty pages are written?
>
> I think what you have in mind are the semantics of sync() rather than fsync(), but as long as an application needs to ensure data are persisted to storage, it needs to retain those data in its heap until fsync() is successful instead of discarding them and relying on the kernel after write().

This is almost exactly what we tell application authors using PostgreSQL: the data isn't written until you receive a successful commit confirmation, so you'd better not forget it.

We provide applications with clear boundaries so they can know exactly what was, and was not, written.
I guess the argument from the kernel is the same is true: whatever was written since the last successful fsync is potentially lost and must be redone.

But the fsync behaviour is utterly undocumented and dubiously standard.

> I think we are conflating a few issues here: having the OS kernel being responsible for error recovery (so that subsequent fsync() would fix the problems) is one. This clearly is a design which most kernels have not really adopted for reasons outlined above

[citation needed]

What do other major platforms do here? The post above suggests it's a bit of a mix of behaviours.

> Now, there is the issue of granularity of error reporting: userspace could benefit from a fine-grained indication of failed pages (or file ranges).

Yep. I looked at AIO in the hopes that, if we used AIO, we'd be able to map a sync failure back to an individual AIO write.

But it seems AIO just adds more problems and fixes none. Flush behaviour with AIO from what I can tell is inconsistent version to version and generally unhelpful. The kernel should really report such sync failures back to the app on its AIO write mapping, but it seems nothing of the sort happens.

From: Christophe Pettus <xof(at)thebuild(dot)com>
Date: 2018-04-03 00:03:39

On Apr 2, 2018, at 16:27, Craig Ringer wrote:
> They're undocumented and extremely surprising semantics that are arguably a violation of the POSIX spec for fsync(), or at least a surprising interpretation of it.

Even accepting that (I personally go with surprising over violation, as if my vote counted), it is highly unlikely that we will convince every kernel team to declare "What fools we've been!" and push a change... and even if they did, PostgreSQL can look forward to many years of running on kernels with the broken semantics.
Given that, I think the PANIC option is the soundest one, as unappetizing as it is.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-03 00:05:09

On April 2, 2018 5:03:39 PM PDT, Christophe Pettus wrote:
> On Apr 2, 2018, at 16:27, Craig Ringer wrote:
>> They're undocumented and extremely surprising semantics that are arguably a violation of the POSIX spec for fsync(), or at least a surprising interpretation of it.
>
> Even accepting that (I personally go with surprising over violation, as if my vote counted), it is highly unlikely that we will convince every kernel team to declare "What fools we've been!" and push a change... and even if they did, PostgreSQL can look forward to many years of running on kernels with the broken semantics. Given that, I think the PANIC option is the soundest one, as unappetizing as it is.

Don't we pretty much already have agreement in that? And Craig is the main proponent of it?

From: Christophe Pettus <xof(at)thebuild(dot)com>
Date: 2018-04-03 00:07:41

On Apr 2, 2018, at 17:05, Andres Freund wrote:
> Don't we pretty much already have agreement in that? And Craig is the main proponent of it?

For sure on the second sentence; the first was not clear to me.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
Date: 2018-04-03 00:48:00

On Mon, Apr 2, 2018 at 5:05 PM, Andres Freund wrote:
>> Even accepting that (I personally go with surprising over violation, as if my vote counted), it is highly unlikely that we will convince every kernel team to declare "What fools we've been!" and push a change... and even if they did, PostgreSQL can look forward to many years of running on kernels with the broken semantics. Given that, I think the PANIC option is the soundest one, as unappetizing as it is.
>
> Don't we pretty much already have agreement in that? And Craig is the main proponent of it?

I wonder how bad it will be in practice if we PANIC.
Craig said "This isn't as bad as it seems because AFAICS fsync only returns EIO in cases where we should be stopping the world anyway, and many FSes will do that for us". It would be nice to get more information on that.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-03 01:29:28

On Tue, Apr 3, 2018 at 3:03 AM, Craig Ringer wrote:
> I see little benefit to not just PANICing unconditionally on EIO, really. It shouldn't happen, and if it does, we want to be pretty conservative and adopt a data-protective approach.
>
> I'm rather more worried by doing it on ENOSPC. Which looks like it might be necessary from what I recall finding in my test case + kernel code reading. I really don't want to respond to a possibly-transient ENOSPC by PANICing the whole server unnecessarily.

Yeah, it'd be nice to give an administrator the chance to free up some disk space after ENOSPC is reported, and stay up. Running out of space really shouldn't take down the database without warning! The question is whether the data remains in cache and marked dirty, so that retrying is a safe option (since it's potentially gone from our own buffers, so if the OS doesn't have it the only place your committed data can definitely still be found is the WAL... recovery time). Who can tell us? Do we need a per-filesystem answer? Delayed allocation is a somewhat filesystem-specific thing, so maybe.

Interestingly, there don't seem to be many operating systems that can report ENOSPC from fsync(), based on a quick scan through some documentation:

POSIX, AIX, HP-UX, FreeBSD, OpenBSD, NetBSD: no
Illumos/Solaris, Linux, macOS: yes

I don't know if macOS really means it or not; it just tells you to see the errors for read(2) and write(2). By the way, speaking of macOS, I was curious to see if the common BSD heritage would show here. Yeah, somewhat.
It doesn't appear to keep buffers on writeback error, if this is the right code [1] (though it could be handling it somewhere else for all I know).

[1] https://github.com/apple/darwin-xnu/blob/master/bsd/vfs/vfs_bio.c#L2695

From: Robert Haas <robertmhaas(at)gmail(dot)com>
Date: 2018-04-03 02:54:26

On Mon, Apr 2, 2018 at 2:53 PM, Anthony Iliopoulos wrote:
> Given precisely that the dirty pages which cannot been written-out are practically thrown away, the semantics of fsync() (after the 4.13 fixes) are essentially correct: the first call indicates that a writeback error indeed occurred, while subsequent calls have no reason to indicate an error (assuming no other errors occurred in the meantime).

Like other people here, I think this is 100% unreasonable, starting with "the dirty pages which cannot been written out are practically thrown away". Who decided that was OK, and on the basis of what wording in what specification? I think it's always unreasonable to throw away the user's data. If the writes are going to fail, then let them keep on failing every time. That wouldn't cause any data loss, because we'd never be able to checkpoint, and eventually the user would have to kill the server uncleanly, and that would trigger recovery.

Also, this really does make it impossible to write reliable programs. Imagine that, while the server is running, somebody runs a program which opens a file in the data directory, calls fsync() on it, and closes it. If the fsync() fails, postgres is now borked and has no way of being aware of the problem. If we knew, we could PANIC, but we'll never find out, because the unrelated process ate the error. This is exactly the sort of ill-considered behavior that makes fcntl() locking nearly useless.

Even leaving that aside, a PANIC means a prolonged outage on a prolonged system - it could easily take tens of minutes or longer to run recovery.
So saying "oh, just do that" is not really an answer. Sure, we can do it, but it's like trying to lose weight by intentionally eating a tapeworm. Now, it's possible to shorten the checkpoint_timeout so that recovery runs faster, but then performance drops because data has to be fsync()'d more often instead of getting buffered in the OS cache for the maximum possible time. We could also dodge this issue in another way: suppose that when we write a page out, we don't consider it really written until fsync() succeeds. Then we wouldn't need to PANIC if an fsync() fails; we could just re-write the page. Unfortunately, this would also be terrible for performance, for pretty much the same reasons: letting the OS cache absorb lots of dirty blocks and do write-combining is necessary for good performance.

> The error reporting is thus consistent with the intended semantics (which are sadly not properly documented). Repeated calls to fsync() simply do not imply that the kernel will retry to writeback the previously-failed pages, so the application needs to be aware of that. Persisting the error at the fsync() level would essentially mean moving application policy into the kernel.

I might accept this argument if I accepted that it was OK to decide that an fsync() failure means you can forget that the write() ever happened in the first place, but it's hard to imagine an application that wants that behavior. If the application didn't care about whether the bytes really got to disk or not, it would not have called fsync() in the first place. If it does care, reporting the error only once is never an improvement.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
Date: 2018-04-03 03:45:30

On Mon, Apr 2, 2018 at 7:54 PM, Robert Haas wrote:
> Also, this really does make it impossible to write reliable programs. Imagine that, while the server is running, somebody runs a program which opens a file in the data directory, calls fsync() on it, and closes it.
> If the fsync() fails, postgres is now borked and has no way of being aware of the problem. If we knew, we could PANIC, but we'll never find out, because the unrelated process ate the error. This is exactly the sort of ill-considered behavior that makes fcntl() locking nearly useless.

I fear that the conventional wisdom from the Kernel people is now "you should be using O_DIRECT for granular control". The LWN article Thomas linked (https://lwn.net/Articles/718734) cites Ted Ts'o:

"Monakhov asked why a counter was needed; Layton said it was to handle multiple overlapping writebacks. Effectively, the counter would record whether a writeback had failed since the file was opened or since the last fsync(). Ts'o said that should be fine; applications that want more information should use O_DIRECT. For most applications, knowledge that an error occurred somewhere in the file is all that is necessary; applications that require better granularity already use O_DIRECT."

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-03 10:35:39

Hi Robert,

On Mon, Apr 02, 2018 at 10:54:26PM -0400, Robert Haas wrote:
> On Mon, Apr 2, 2018 at 2:53 PM, Anthony Iliopoulos wrote:
>> Given precisely that the dirty pages which cannot been written-out are practically thrown away, the semantics of fsync() (after the 4.13 fixes) are essentially correct: the first call indicates that a writeback error indeed occurred, while subsequent calls have no reason to indicate an error (assuming no other errors occurred in the meantime).
>
> Like other people here, I think this is 100% unreasonable, starting with "the dirty pages which cannot been written out are practically thrown away". Who decided that was OK, and on the basis of what wording in what specification? I think it's always unreasonable to

If you insist on strict conformance to POSIX, indeed the linux glibc configuration and associated manpage are probably wrong in stating that _POSIX_SYNCHRONIZED_IO is supported.
The implementation matches that of the flexibility allowed by not supporting SIO. There's a long history of brokenness between linux and posix, and I think there was never an intention of conforming to the standard.

> throw away the user's data. If the writes are going to fail, then let them keep on failing every time. That wouldn't cause any data loss, because we'd never be able to checkpoint, and eventually the user would have to kill the server uncleanly, and that would trigger recovery.

I believe (as tried to explain earlier) there is a certain assumption being made that the writer and original owner of data is responsible for dealing with potential errors in order to avoid data loss (which should be only of interest to the original writer anyway). It would be very questionable for the interface to persist the error while subsequent writes and fsyncs to different offsets may as well go through. Another process may need to write into the file and fsync, while being unaware of those newly introduced semantics is now faced with EIO because some unrelated previous process failed some earlier writes and did not bother to clear the error for those writes. In a similar scenario where the second process is aware of the new semantics, it would naturally go ahead and clear the global error in order to proceed with its own write()+fsync(), which would essentially amount to the same problematic semantics you have now.

> Also, this really does make it impossible to write reliable programs. Imagine that, while the server is running, somebody runs a program which opens a file in the data directory, calls fsync() on it, and closes it. If the fsync() fails, postgres is now borked and has no way of being aware of the problem.
> If we knew, we could PANIC, but we'll never find out, because the unrelated process ate the error. This is exactly the sort of ill-considered behavior that makes fcntl() locking nearly useless.

Fully agree, and the errseq_t fixes have dealt exactly with the issue of making sure that the error is reported to all file descriptors that happen to be open at the time of error. But I think one would have a hard time defending a modification to the kernel where this is further extended to cover cases where:

process A does write() on some file offset which fails writeback, fsync() gets EIO and exit()s.

process B does write() on some other offset which succeeds writeback, but fsync() gets EIO due to (uncleared) failures of earlier process.

This would be a highly user-visible change of semantics from edge-triggered to level-triggered behavior.

> dodge this issue in another way: suppose that when we write a page out, we don't consider it really written until fsync() succeeds. Then

That's the only way to think about fsync() guarantees unless you are on a kernel that keeps retrying to persist dirty pages. Assuming such a model, after repeated and unrecoverable hard failures the process would have to explicitly inform the kernel to drop the dirty pages. All the process could do at that point is read back to userspace the dirty/failed pages and attempt to rewrite them at a different place (which is current possible too). Most applications would not bother though to inform the kernel and drop the permanently failed pages; and thus someone eventually would hit the case that a large amount of failed writeback pages are running his server out of memory, at which point people will complain that those semantics are completely unreasonable.

> we wouldn't need to PANIC if an fsync() fails; we could just re-write the page.
> Unfortunately, this would also be terrible for performance, for pretty much the same reasons: letting the OS cache absorb lots of dirty blocks and do write-combining is necessary for good performance.

Not sure I understand this case. The application may indeed re-write a bunch of pages that have failed and proceed with fsync(). The kernel will deal with combining the writeback of all the re-written pages. But further the necessity of combining for performance really depends on the exact storage medium. At the point you start caring about write-combining, the kernel community will naturally redirect you to use DIRECT_IO.

>> The error reporting is thus consistent with the intended semantics (which are sadly not properly documented). Repeated calls to fsync() simply do not imply that the kernel will retry to writeback the previously-failed pages, so the application needs to be aware of that. Persisting the error at the fsync() level would essentially mean moving application policy into the kernel.
>
> I might accept this argument if I accepted that it was OK to decide that an fsync() failure means you can forget that the write() ever happened in the first place, but it's hard to imagine an application that wants that behavior. If the application didn't care about whether the bytes really got to disk or not, it would not have called fsync() in the first place. If it does care, reporting the error only once is never an improvement.

Again, conflating two separate issues, that of buffering and retrying failed pages and that of error reporting.
Yes it would be convenient for applications not to have to care at all about recovery of failed write-backs, but at some point they would have to face this issue one way or another (I am assuming we are always talking about hard failures, other kinds of failures are probably already being dealt with transparently at the kernel level).

As for the reporting, it is also unreasonable to effectively signal and persist an error on a file-wide granularity while it pertains to subsets of that file and other writes can go through, but I am repeating myself.

I suppose that if the check-and-clear semantics are problematic for Pg, one could suggest a kernel patch that opts-in to a level-triggered reporting of fsync() on a per-descriptor basis, which seems to be non-intrusive and probably sufficient to cover your expected use-case.

From: Greg Stark <stark(at)mit(dot)edu>
Date: 2018-04-03 11:26:05

On 3 April 2018 at 11:35, Anthony Iliopoulos wrote:

Hi Robert,

Fully agree, and the errseq_t fixes have dealt exactly with the issue of making sure that the error is reported to all file descriptors that happen to be open at the time of error. But I think one would have a hard time defending a modification to the kernel where this is further extended to cover cases where:

process A does write() on some file offset which fails writeback, fsync() gets EIO and exit()s.

process B does write() on some other offset which succeeds writeback, but fsync() gets EIO due to (uncleared) failures of earlier process.

Surely that's exactly what process B would want? If it calls fsync and gets a success and later finds out that the file is corrupt and didn't match what was in memory it's not going to be happy.

This seems like an attempt to co-opt fsync for a new and different purpose for which it's poorly designed. It's not an async error reporting mechanism for writes. It would be useless as that as any process could come along and open your file and eat the errors for writes you performed.
An async error reporting mechanism would have to document which writes it was giving errors for and give you ways to control that.

The semantics described here are useless for everyone. For a program needing to know the error status of the writes it executed, it doesn't know which writes are included in which fsync call. For a program using fsync for its original intended purpose of guaranteeing that all the writes are synced to disk it no longer has any guarantee at all.

This would be a highly user-visible change of semantics from edge-triggered to level-triggered behavior.

It was always documented as level-triggered. This edge-triggered concept is a complete surprise to application writers.

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-03 13:36:47

On Tue, Apr 03, 2018 at 12:26:05PM +0100, Greg Stark wrote:

On 3 April 2018 at 11:35, Anthony Iliopoulos wrote:

Hi Robert,

Fully agree, and the errseq_t fixes have dealt exactly with the issue of making sure that the error is reported to all file descriptors that happen to be open at the time of error. But I think one would have a hard time defending a modification to the kernel where this is further extended to cover cases where:

process A does write() on some file offset which fails writeback, fsync() gets EIO and exit()s.

process B does write() on some other offset which succeeds writeback, but fsync() gets EIO due to (uncleared) failures of earlier process.

Surely that's exactly what process B would want? If it calls fsync and gets a success and later finds out that the file is corrupt and didn't match what was in memory it's not going to be happy.

You can't possibly make this assumption. Process B may be reading and writing to completely disjoint regions from those of process A, and as such not really caring about earlier failures, only wanting to ensure its own writes go all the way through. But even if it did care, the file interfaces make no transactional guarantees.
Even without fsync() there is nothing preventing process B from reading dirty pages from process A, and based on their content proceed to its own business and write/persist new data, while process A further modifies the not-yet-flushed pages in-memory before flushing. In this case you'd need explicit synchronization/locking between the processes anyway, so why would fsync() be an exception?

This seems like an attempt to co-opt fsync for a new and different purpose for which it's poorly designed. It's not an async error reporting mechanism for writes. It would be useless as that as any process could come along and open your file and eat the errors for writes you performed. An async error reporting mechanism would have to document which writes it was giving errors for and give you ways to control that.

The errseq_t fixes deal with that; errors will be reported to any process that has an open fd, irrespective to who is the actual caller of the fsync() that may have induced errors. This is anyway required as the kernel may evict dirty pages on its own by doing writeback and as such there needs to be a way to report errors on all open fds.

The semantics described here are useless for everyone. For a program needing to know the error status of the writes it executed, it doesn't know which writes are included in which fsync call.
For a program

If EIO persists between invocations until explicitly cleared, a process cannot possibly make any decision as to if it should clear the error and proceed or some other process will need to leverage that without coordination, or which writes actually failed for that matter. We would be back to the case of requiring explicit synchronization between processes that care about this, in which case the processes may as well synchronize over calling fsync() in the first place.

Having an opt-in persisting EIO per-fd would practically be a form of "contract" between "cooperating" processes anyway.

But instead of deconstructing and debating the semantics of the current mechanism, why not come up with the ideal desired form of error reporting/tracking granularity etc., and see how this may be fitted into kernels as a new interface.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-03 14:29:10

On 3 April 2018 at 10:54, Robert Haas wrote:

I think it's always unreasonable to throw away the user's data.

Well, we do that. If a txn aborts, all writes in the txn are discarded.

I think that's perfectly reasonable. Though we also promise an all or nothing effect, we make exceptions even there.

The FS doesn't offer transactional semantics, but the fsync behaviour can be interpreted kind of similarly.

I don't agree with it, but I don't think it's as wholly unreasonable as all that. I think leaving it undocumented is absolutely gobsmacking, and it's dubious at best, but it's not totally insane.

If the writes are going to fail, then let them keep on failing every time.

Like we do, where we require an explicit rollback.

But POSIX may pose issues there, it doesn't really define any interface for that AFAIK.
Unless you expect the app to close() and re-open() the file. Replacing one nonstandard issue with another may not be a win.

That wouldn't cause any data loss, because we'd never be able to checkpoint, and eventually the user would have to kill the server uncleanly, and that would trigger recovery.

Yep. That's what I expected to happen on unrecoverable I/O errors. Because, y'know, unrecoverable.

I was stunned to learn it's not so. And I'm even more amazed to learn that ext4's errors=remount-ro apparently doesn't concern itself with mere user data, and may exhibit the same behaviour - I need to rerun my test case on it tomorrow.

Also, this really does make it impossible to write reliable programs.

In the presence of multiple apps interacting on the same file, yes. I think that's a little bit of a stretch though.

For a single app, you can recover by remembering and redoing all the writes you did.

Sucks if your app wants to have multiple processes working together on a file without some kind of journal or WAL, relying on fsync() alone, mind you. But at least we have WAL.

Hrm. I wonder how this interacts with wal_level=minimal.

Even leaving that aside, a PANIC means a prolonged outage on a prolonged system - it could easily take tens of minutes or longer to run recovery. So saying "oh, just do that" is not really an answer. Sure, we can do it, but it's like trying to lose weight by intentionally eating a tapeworm. Now, it's possible to shorten the checkpoint_timeout so that recovery runs faster, but then performance drops because data has to be fsync()'d more often instead of getting buffered in the OS cache for the maximum possible time.

It's also spikier. Users have more issues with latency with short, frequent checkpoints.

We could also dodge this issue in another way: suppose that when we write a page out, we don't consider it really written until fsync() succeeds. Then we wouldn't need to PANIC if an fsync() fails; we could just re-write the page.
Unfortunately, this would also be terrible for performance, for pretty much the same reasons: letting the OS cache absorb lots of dirty blocks and do write-combining is necessary for good performance.

Our double-caching is already plenty bad enough anyway, as well.

(Ideally I want to be able to swap buffers between shared_buffers and the OS buffer-cache. Almost like a 2nd level of buffer pinning. When we write out a block, we transfer ownership to the OS. Yeah, I'm dreaming. But we'd sure need to be able to trust the OS not to just forget the block then!)

The error reporting is thus consistent with the intended semantics (which are sadly not properly documented). Repeated calls to fsync() simply do not imply that the kernel will retry to writeback the previously-failed pages, so the application needs to be aware of that. Persisting the error at the fsync() level would essentially mean moving application policy into the kernel.

I might accept this argument if I accepted that it was OK to decide that an fsync() failure means you can forget that the write() ever happened in the first place, but it's hard to imagine an application that wants that behavior. If the application didn't care about whether the bytes really got to disk or not, it would not have called fsync() in the first place. If it does care, reporting the error only once is never an improvement.

Many RDBMSes do just that. It's hardly behaviour unique to the kernel. They report an ERROR on a statement in a txn then go on with life, merrily forgetting that anything was ever wrong.

I agree with PostgreSQL's stance that this is wrong. We require an explicit rollback (or ROLLBACK TO SAVEPOINT) to restore the session to a usable state. This is good.

But we're the odd one out there.
Almost everyone else does much like what fsync() does on Linux, report the error and forget it.

In any case, we're not going to get anyone to backpatch a fix for this into all kernels, so we're stuck working around it.

I'll do some testing with ENOSPC tomorrow, propose a patch, report back.

From: Greg Stark <stark(at)mit(dot)edu>
Date: 2018-04-03 14:37:30

On 3 April 2018 at 14:36, Anthony Iliopoulos wrote:

If EIO persists between invocations until explicitly cleared, a process cannot possibly make any decision as to if it should clear the error

I still don't understand what "clear the error" means here. The writes still haven't been written out. We don't care about tracking errors, we just care whether all the writes to the file have been flushed to disk. By "clear the error" you mean throw away the dirty pages and revert part of the file to some old data? Why would anyone ever want that?

But instead of deconstructing and debating the semantics of the current mechanism, why not come up with the ideal desired form of error reporting/tracking granularity etc., and see how this may be fitted into kernels as a new interface.

Because Postgres is portable software that won't be able to use some Linux-specific interface. And doesn't really need any granular error reporting system anyways. It just needs to know when all writes have been synced to disk.

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-03 16:52:07

On Tue, Apr 03, 2018 at 03:37:30PM +0100, Greg Stark wrote:

On 3 April 2018 at 14:36, Anthony Iliopoulos wrote:

If EIO persists between invocations until explicitly cleared, a process cannot possibly make any decision as to if it should clear the error

I still don't understand what "clear the error" means here. The writes still haven't been written out. We don't care about tracking errors, we just care whether all the writes to the file have been flushed to disk. By "clear the error" you mean throw away the dirty pages and revert part of the file to some old data?
Why would anyone ever want that?

It means that the responsibility of recovering the data is passed back to the application. The writes may never be able to be written out. How would a kernel deal with that? Either discard the data (and have the writer acknowledge) or buffer the data until reboot and simply risk going OOM. It's not what someone would want, but rather need to deal with, one way or the other. At least on the application-level there's a fighting chance for restoring to a consistent state. The kernel does not have that opportunity.

But instead of deconstructing and debating the semantics of the current mechanism, why not come up with the ideal desired form of error reporting/tracking granularity etc., and see how this may be fitted into kernels as a new interface.

Because Postgres is portable software that won't be able to use some Linux-specific interface. And doesn't really need any granular error

I don't really follow this argument, Pg is admittedly using non-portable interfaces (e.g the sync_file_range()). While it's nice to avoid platform specific hacks, expecting that the POSIX semantics will be consistent across systems is simply a 90's pipe dream. While it would be lovely to have really consistent interfaces for application writers, this is simply not going to happen any time soon.

And since those problematic semantics of fsync() appear to be prevalent in other systems as well that are not likely to be changed, you cannot rely on preconception that once buffers are handed over to kernel you have a guarantee that they will be eventually persisted no matter what. (Why even bother having fsync() in that case? The kernel would eventually evict and writeback dirty pages anyway. The point of reporting the error back to the application is to give it a chance to recover - the kernel could repeat "fsync()" itself internally if this would solve anything).

reporting system anyways.
It just needs to know when all writes have been synced to disk.

Well, it does know when some writes have not been synced to disk, exactly because the responsibility is passed back to the application. I do realize this puts more burden back to the application, but what would a viable alternative be? Would you rather have a kernel that risks periodically going OOM due to this design decision?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
Date: 2018-04-03 21:47:01

On Tue, Apr 3, 2018 at 6:35 AM, Anthony Iliopoulos wrote:

Like other people here, I think this is 100% unreasonable, starting with "the dirty pages which cannot been written out are practically thrown away". Who decided that was OK, and on the basis of what wording in what specification? I think it's always unreasonable to

If you insist on strict conformance to POSIX, indeed the linux glibc configuration and associated manpage are probably wrong in stating that _POSIX_SYNCHRONIZED_IO is supported. The implementation matches that of the flexibility allowed by not supporting SIO. There's a long history of brokenness between linux and posix, and I think there was never an intention of conforming to the standard.

Well, then the man page probably shouldn't say CONFORMING TO 4.3BSD, POSIX.1-2001, which on the first system I tested, it did. Also, the summary should be changed from the current "fsync, fdatasync - synchronize a file's in-core state with storage device" by adding ", possibly by randomly undoing some of the changes you think you made to the file".

I believe (as tried to explain earlier) there is a certain assumption being made that the writer and original owner of data is responsible for dealing with potential errors in order to avoid data loss (which should be only of interest to the original writer anyway). It would be very questionable for the interface to persist the error while subsequent writes and fsyncs to different offsets may as well go through.

No, that's not questionable at all.
fsync() doesn't take any argument saying which part of the file you care about, so the kernel is entirely not entitled to assume it knows to which writes a given fsync() call was intended to apply.

Another process may need to write into the file and fsync, while being unaware of those newly introduced semantics is now faced with EIO because some unrelated previous process failed some earlier writes and did not bother to clear the error for those writes. In a similar scenario where the second process is aware of the new semantics, it would naturally go ahead and clear the global error in order to proceed with its own write()+fsync(), which would essentially amount to the same problematic semantics you have now.

I don't deny that it's possible that somebody could have an application which is utterly indifferent to the fact that earlier modifications to a file failed due to I/O errors, but is A-OK with that as long as later modifications can be flushed to disk, but I don't think that's a normal thing to want.

Also, this really does make it impossible to write reliable programs. Imagine that, while the server is running, somebody runs a program which opens a file in the data directory, calls fsync() on it, and closes it. If the fsync() fails, postgres is now borked and has no way of being aware of the problem. If we knew, we could PANIC, but we'll never find out, because the unrelated process ate the error. This is exactly the sort of ill-considered behavior that makes fcntl() locking nearly useless.

Fully agree, and the errseq_t fixes have dealt exactly with the issue of making sure that the error is reported to all file descriptors that happen to be open at the time of error.

Well, in PostgreSQL, we have a background process called the checkpointer which is the process that normally does all of the fsync() calls but only a subset of the write() calls.
The checkpointer does not, however, necessarily have every file open all the time, so these fixes aren't sufficient to make sure that the checkpointer ever sees an fsync() failure. What you have (or someone has) basically done here is made an undocumented assumption about which file descriptors might care about a particular error, but it just so happens that PostgreSQL has never conformed to that assumption. You can keep on saying the problem is with our assumptions, but it doesn't seem like a very good guess to me to suppose that we're the only program that has ever made them. The documentation for fsync() gives zero indication that it's edge-triggered, and so complaining that people wouldn't like it if it became level-triggered seems like an ex post facto justification for a poorly-chosen behavior: they probably think (as we did prior to a week ago) that it already is.

Not sure I understand this case. The application may indeed re-write a bunch of pages that have failed and proceed with fsync(). The kernel will deal with combining the writeback of all the re-written pages. But further the necessity of combining for performance really depends on the exact storage medium. At the point you start caring about write-combining, the kernel community will naturally redirect you to use DIRECT_IO.

Well, the way PostgreSQL works today, we typically run with say 8GB of shared_buffers even if the system memory is, say, 200GB. As pages are evicted from our relatively small cache to the operating system, we track which files need to be fsync()'d at checkpoint time, but we don't hold onto the blocks. Until checkpoint time, the operating system is left to decide whether it's better to keep caching the dirty blocks (thus leaving less memory for other things, but possibly allowing write-combining if the blocks are written again) or whether it should clean them to make room for other things.
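The pattern described above (one side writes blocks and remembers only which files are dirty, another side later opens those files and fsync()s them) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not PostgreSQL's actual code: `pending_add`, `backend_write`, `checkpoint_sync`, and the fixed-size table are hypothetical names invented for this example. The checkpoint step deliberately opens a fresh descriptor, which is the case the thread worries about.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_PENDING  64
#define MAX_PATH_LEN 256

/* Hypothetical table of files written since the last checkpoint. */
static char pending[MAX_PENDING][MAX_PATH_LEN];
static int  npending = 0;

/* Remember that `path` needs an fsync(); 0 on success, -1 if it cannot be tracked. */
static int pending_add(const char *path)
{
    assert(npending >= 0 && npending <= MAX_PENDING);
    if (strlen(path) >= MAX_PATH_LEN)
        return -1;
    for (int i = 0; i < npending; i++)
        if (strcmp(pending[i], path) == 0)
            return 0;                 /* already tracked */
    if (npending == MAX_PENDING)
        return -1;
    strcpy(pending[npending++], path);
    return 0;
}

/* "Backend" side: hand a block to the OS cache and keep only the file name. */
static int backend_write(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, buf, len);
    close(fd);                        /* the block itself is no longer held here */
    if (n != (ssize_t) len)
        return -1;
    return pending_add(path);
}

/* "Checkpointer" side: 0 if every fsync() succeeded, -1 on the first failure. */
static int checkpoint_sync(void)
{
    for (int i = 0; i < npending; i++) {
        int fd = open(pending[i], O_RDWR);   /* a descriptor newer than the writes */
        if (fd < 0)
            return -1;
        int rc = fsync(fd);
        close(fd);
        if (rc != 0) {
            perror("fsync");
            return -1;                /* retrying fsync() later proves nothing */
        }
    }
    npending = 0;
    return 0;
}
```

A caller would invoke `backend_write()` as blocks are evicted and `checkpoint_sync()` at checkpoint time. In this sketch a failure leaves the pending list intact, but per the discussion in this thread a later successful fsync() on the same file would not show that the earlier data reached disk.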
This means that only a small portion of the operating system memory is directly managed by PostgreSQL, while allowing the effective size of our cache to balloon to some very large number if the system isn't under heavy memory pressure.

Now, I hear the DIRECT_IO thing and I assume we're eventually going to have to go that way: Linux kernel developers seem to think that "real men use O_DIRECT" and so if other forms of I/O don't provide useful guarantees, well that's our fault for not using O_DIRECT. That's a political reason, not a technical reason, but it's a reason all the same.

Unfortunately, that is going to add a huge amount of complexity, because if we ran with shared_buffers set to a large percentage of system memory, we couldn't allocate large chunks of memory for sorts and hash tables from the operating system any more. We'd have to allocate it from our own shared_buffers because that's basically all the memory there is and using substantially more might run the system out entirely. So it's a huge, huge architectural change.
And even once it's done it is in some ways inferior to what we are doing today -- true, it gives us superior control over writeback timing, but it also makes PostgreSQL play less nicely with other things running on the same machine, because now PostgreSQL has a dedicated chunk of whatever size it has, rather than using some portion of the OS buffer cache that can grow and shrink according to memory needs both of other parts of PostgreSQL and other applications on the system.

I suppose that if the check-and-clear semantics are problematic for Pg, one could suggest a kernel patch that opts-in to a level-triggered reporting of fsync() on a per-descriptor basis, which seems to be non-intrusive and probably sufficient to cover your expected use-case.

That would certainly be better than nothing.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-03 23:59:27

On Tue, Apr 3, 2018 at 1:29 PM, Thomas Munro wrote:

Interestingly, there don't seem to be many operating systems that can report ENOSPC from fsync(), based on a quick scan through some documentation:

POSIX, AIX, HP-UX, FreeBSD, OpenBSD, NetBSD: no
Illumos/Solaris, Linux, macOS: yes

Oops, reading comprehension fail. POSIX yes (since issue 5), via the note that read() and write()'s error conditions can also be returned.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-04 00:56:37

On Tue, Apr 3, 2018 at 05:47:01PM -0400, Robert Haas wrote:

Well, in PostgreSQL, we have a background process called the checkpointer which is the process that normally does all of the fsync() calls but only a subset of the write() calls.
The checkpointer does not, however, necessarily have every file open all the time, so these fixes aren't sufficient to make sure that the checkpointer ever sees an fsync() failure.

There has been a lot of focus in this thread on the workflow:

write() -> blocks remain in kernel memory -> fsync() -> panic?

But what happens in this workflow:

write() -> kernel syncs blocks to storage -> fsync()

Is fsync() going to see a "kernel syncs blocks to storage" failure?

There was already discussion that if the fsync() causes the "syncs blocks to storage", fsync() will only report the failure once, but will it see any failure in the second workflow? There is indication that a failed write to storage reports back an error once and clears the dirty flag, but do we know it keeps things around long enough to report an error to a future fsync()?

You would think it does, but I have to ask since our fsync() assumptions have been wrong for so long.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-04 01:54:50

On Wed, Apr 4, 2018 at 12:56 PM, Bruce Momjian wrote:

There has been a lot of focus in this thread on the workflow:

write() -> blocks remain in kernel memory -> fsync() -> panic?

But what happens in this workflow:

write() -> kernel syncs blocks to storage -> fsync()

Is fsync() going to see a "kernel syncs blocks to storage" failure?

There was already discussion that if the fsync() causes the "syncs blocks to storage", fsync() will only report the failure once, but will it see any failure in the second workflow?
There is indication that a failed write to storage reports back an error once and clears the dirty flag, but do we know it keeps things around long enough to report an error to a future fsync()?

You would think it does, but I have to ask since our fsync() assumptions have been wrong for so long.

I believe there were some problems of that nature (with various twists, based on other concurrent activity and possibly different fds), and those problems were fixed by the errseq_t system developed by Jeff Layton in Linux 4.13. Call that "bug #1".

The second issue is that the pages are marked clean after the error is reported, so further attempts to fsync() the data (in our case for a new attempt to checkpoint) will be futile but appear successful. Call that "bug #2", with the proviso that some people apparently think it's reasonable behaviour and not a bug. At least there is a plausible workaround for that: namely the nuclear option proposed by Craig.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-04 02:05:19

On Wed, Apr 4, 2018 at 01:54:50PM +1200, Thomas Munro wrote:

On Wed, Apr 4, 2018 at 12:56 PM, Bruce Momjian wrote:

There has been a lot of focus in this thread on the workflow:

write() -> blocks remain in kernel memory -> fsync() -> panic?

But what happens in this workflow:

write() -> kernel syncs blocks to storage -> fsync()

Is fsync() going to see a "kernel syncs blocks to storage" failure?

There was already discussion that if the fsync() causes the "syncs blocks to storage", fsync() will only report the failure once, but will it see any failure in the second workflow?
There is indication that a failed write to storage reports back an error once and clears the dirty flag, but do we know it keeps things around long enough to report an error to a future fsync()?

You would think it does, but I have to ask since our fsync() assumptions have been wrong for so long.

I believe there were some problems of that nature (with various twists, based on other concurrent activity and possibly different fds), and those problems were fixed by the errseq_t system developed by Jeff Layton in Linux 4.13. Call that "bug #1".

So all our non-cutting-edge Linux systems are vulnerable and there is no workaround Postgres can implement? Wow.

The second issue is that the pages are marked clean after the error is reported, so further attempts to fsync() the data (in our case for a new attempt to checkpoint) will be futile but appear successful. Call that "bug #2", with the proviso that some people apparently think it's reasonable behaviour and not a bug. At least there is a plausible workaround for that: namely the nuclear option proposed by Craig.

Yes, that one I understood.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-04 02:14:28

On Tue, Apr 3, 2018 at 10:05:19PM -0400, Bruce Momjian wrote:

On Wed, Apr 4, 2018 at 01:54:50PM +1200, Thomas Munro wrote:

I believe there were some problems of that nature (with various twists, based on other concurrent activity and possibly different fds), and those problems were fixed by the errseq_t system developed by Jeff Layton in Linux 4.13. Call that "bug #1".

So all our non-cutting-edge Linux systems are vulnerable and there is no workaround Postgres can implement? Wow.

Uh, are you sure it fixes our use-case?
From the email description it sounded like it only reported fsync errors for every open file descriptor at the time of the failure, but the checkpoint process might open the file after the failure and try to fsync a write that happened before the failure.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-04 02:40:16

On 4 April 2018 at 05:47, Robert Haas wrote:

Now, I hear the DIRECT_IO thing and I assume we're eventually going to have to go that way: Linux kernel developers seem to think that "real men use O_DIRECT" and so if other forms of I/O don't provide useful guarantees, well that's our fault for not using O_DIRECT. That's a political reason, not a technical reason, but it's a reason all the same.

I looked into buffered AIO a while ago, by the way, and just ... hell no. Run, run as fast as you can.

The trouble with direct I/O is that it pushes a lot of work back on PostgreSQL regarding knowledge of the storage subsystem, I/O scheduling, etc. It's absurd to have the kernel do this, unless you want it reliable, in which case you bypass it and drive the hardware directly.

We'd need pools of writer threads to deal with all the blocking I/O. It'd be such a nightmare. Hey, why bother having a kernel at all, except for drivers?

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-04 02:44:22

On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian wrote:

On Tue, Apr 3, 2018 at 10:05:19PM -0400, Bruce Momjian wrote:

On Wed, Apr 4, 2018 at 01:54:50PM +1200, Thomas Munro wrote:

I believe there were some problems of that nature (with various twists, based on other concurrent activity and possibly different fds), and those problems were fixed by the errseq_t system developed by Jeff Layton in Linux 4.13. Call that "bug #1".

So all our non-cutting-edge Linux systems are vulnerable and there is no workaround Postgres can implement? Wow.

Uh, are you sure it fixes our use-case?
From the email description it sounded like it only reported fsync errors for every open file descriptor at the time of the failure, but the checkpoint process might open the file after the failure and try to fsync a write that happened before the failure.

I'm not sure of anything. I can see that it's designed to report errors since the last fsync() of the file (presumably via any fd), which sounds like the desired behaviour:

https://github.com/torvalds/linux/blob/master/mm/filemap.c#L682

When userland calls fsync (or something like nfsd does the equivalent), we want to report any writeback errors that occurred since the last fsync (or since the file was opened if there haven't been any).

But I'm not sure what the lifetime of the passed-in "file" and more importantly "file->f_wb_err" is. Specifically, what happens to it if no one has the file open at all, between operations? It is reference counted, see fs/file_table.c. I don't know enough about it to comment.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-04 05:29:28

On Wed, Apr 4, 2018 at 2:44 PM, Thomas Munro wrote:

On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian wrote:

Uh, are you sure it fixes our use-case? From the email description it sounded like it only reported fsync errors for every open file descriptor at the time of the failure, but the checkpoint process might open the file after the failure and try to fsync a write that happened before the failure.

I'm not sure of anything. I can see that it's designed to report errors since the last fsync() of the file (presumably via any fd), which sounds like the desired behaviour:

[..]

Scratch that.
Whenever you open a file descriptor you can't see any preceding errors at all, because:

/* Ensure that we skip any errors that predate opening of the file */
f->f_wb_err = filemap_sample_wb_err(f->f_mapping);

https://github.com/torvalds/linux/blob/master/fs/open.c#L752

Our whole design is based on being able to open, close and reopen files at will from any process, and in particular to fsync() from a different process that didn't inherit the fd but instead opened it later. But it looks like that might be able to eat errors that occurred during asynchronous writeback (when there was nobody to report them to), before you opened the file?

If so I'm not sure how that can possibly be considered to be an implementation of _POSIX_SYNCHRONIZED_IO: "the fsync() function shall force all currently queued I/O operations associated with the file indicated by file descriptor fildes to the synchronized I/O completion state." Note "the file", not "this file descriptor + copies", and without reference to when you opened it.

But I'm not sure what the lifetime of the passed-in "file" and more importantly "file->f_wb_err" is.

It's really inode->i_mapping->wb_err's lifetime that I should have been asking about there, not file->f_wb_err, but I see now that that question is irrelevant due to the above.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-04 06:00:21

On 4 April 2018 at 13:29, Thomas Munro wrote:

On Wed, Apr 4, 2018 at 2:44 PM, Thomas Munro wrote:

On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian wrote:

Uh, are you sure it fixes our use-case? From the email description it sounded like it only reported fsync errors for every open file descriptor at the time of the failure, but the checkpoint process might open the file after the failure and try to fsync a write that happened before the failure.

I'm not sure of anything. I can see that it's designed to report errors since the last fsync() of the file (presumably via any fd), which sounds like the desired behaviour:

[..]

Scratch that.
> Whenever you open a file descriptor you can't see any preceding errors at all, because:
>
>     /* Ensure that we skip any errors that predate opening of the file */
>     f->f_wb_err = filemap_sample_wb_err(f->f_mapping);
>
> https://github.com/torvalds/linux/blob/master/fs/open.c#L752
>
> Our whole design is based on being able to open, close and reopen files at will from any process, and in particular to fsync() from a different process that didn't inherit the fd but instead opened it later. But it looks like that might be able to eat errors that occurred during asynchronous writeback (when there was nobody to report them to), before you opened the file?

Holy hell. So even PANICing on fsync() isn't sufficient, because the kernel will deliberately hide writeback errors that predate our fsync() call from us?

I'll see if I can expand my testcase for that. I'm presently dockerizing it to make it easier for others to use, but that turns out to be a major pain when using devmapper etc. Docker in privileged mode doesn't seem to play nice with device-mapper.

Does that mean that the ONLY ways to do reliable I/O are:

- single-process, single-file-descriptor write() then fsync(); on failure, retry all work since last successful fsync()
- direct I/O

?

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-04 07:32:04

On Wed, Apr 4, 2018 at 6:00 PM, Craig Ringer wrote:
> On 4 April 2018 at 13:29, Thomas Munro wrote:
>>     /* Ensure that we skip any errors that predate opening of the file */
>>     f->f_wb_err = filemap_sample_wb_err(f->f_mapping);
>> [...]
> Holy hell. So even PANICing on fsync() isn't sufficient, because the kernel will deliberately hide writeback errors that predate our fsync() call from us?

Predates the opening of the file by the process that calls fsync(). Yeah, it sure looks that way based on the above code fragment.
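The effect of that open-time sampling can be illustrated with a toy model. The sketch below is not kernel code: `toy_mapping`, `toy_file` and the three functions are hypothetical names, and a plain counter stands in for the kernel's real error-tracking type. It only shows why a descriptor opened after a writeback failure would never report it, assuming the behaviour is as the quoted fragment suggests.

```c
#include <stdbool.h>

/* Toy model only; all names here are hypothetical, not kernel identifiers. */
struct toy_mapping {
    unsigned wb_err_seq;   /* bumped each time background writeback fails */
};

struct toy_file {
    struct toy_mapping *mapping;
    unsigned seen_seq;     /* value of wb_err_seq sampled at open time */
};

/* Simulate asynchronous writeback failing for this mapping. */
void toy_writeback_fails(struct toy_mapping *m)
{
    m->wb_err_seq++;
}

/* Like open(): sample the current sequence, so older errors are skipped. */
struct toy_file toy_open(struct toy_mapping *m)
{
    struct toy_file f = { .mapping = m, .seen_seq = m->wb_err_seq };
    return f;
}

/* Like fsync(): report an error only if one occurred since our last sample,
 * then advance the sample so the same error is not reported twice. */
bool toy_fsync_reports_error(struct toy_file *f)
{
    bool err = f->seen_seq != f->mapping->wb_err_seq;
    f->seen_seq = f->mapping->wb_err_seq;
    return err;
}
```

In this model a descriptor that was open when the failure happened reports it once, while a descriptor opened afterwards reports nothing, which is exactly the case that matters for a checkpointer that opens files late.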
Does anyone know better?

> Does that mean that the ONLY ways to do reliable I/O are:
> - single-process, single-file-descriptor write() then fsync(); on failure, retry all work since last successful fsync()

I suppose you could come up with some crazy complicated IPC scheme to make sure that the checkpointer always has an fd older than any writes to be flushed, with some fallback strategy for when it can't take any more fds.

I haven't got any good ideas right now.

> - direct I/O

As a bit of an aside, I gather that when you resize files (think truncating/extending relation files) you still need to call fsync() even if you read/write all data with O_DIRECT, to make it flush the filesystem meta-data. I have no idea if that could also be affected by eaten writeback errors.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-04 07:51:53

On 4 April 2018 at 14:00, Craig Ringer wrote:
> On 4 April 2018 at 13:29, Thomas Munro wrote:
>> On Wed, Apr 4, 2018 at 2:44 PM, Thomas Munro wrote:
>>> On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian wrote:
>>>> Uh, are you sure it fixes our use-case? From the email description it sounded like it only reported fsync errors for every open file descriptor at the time of the failure, but the checkpoint process might open the file after the failure and try to fsync a write that happened before the failure.
>>> I'm not sure of anything. I can see that it's designed to report errors since the last fsync() of the file (presumably via any fd), which sounds like the desired behaviour:
>>> [..]
>> Scratch that. Whenever you open a file descriptor you can't see any preceding errors at all, because:
>>
>>     /* Ensure that we skip any errors that predate opening of the file */
>>     f->f_wb_err = filemap_sample_wb_err(f->f_mapping);
>>
>> https://github.com/torvalds/linux/blob/master/fs/open.c#L752
>>
>> Our whole design is based on being able to open, close and reopen files at will from any process, and in particular to fsync() from a different process that didn't inherit the fd but instead opened it later.
>> But it looks like that might be able to eat errors that occurred during asynchronous writeback (when there was nobody to report them to), before you opened the file?
> Holy hell. So even PANICing on fsync() isn't sufficient, because the kernel will deliberately hide writeback errors that predate our fsync() call from us?
>
> I'll see if I can expand my testcase for that. I'm presently dockerizing it to make it easier for others to use, but that turns out to be a major pain when using devmapper etc. Docker in privileged mode doesn't seem to play nice with device-mapper.

Done, you can find it in https://github.com/ringerc/scrapcode/tree/master/testcases/fsync-error-clear now.

Warning, this runs a Docker container in privileged mode on your system, and it uses devicemapper. Read it before you run it, and while I've tried to keep it safe, beware that it might eat your system.

For now it tests only xfs and EIO. Other FSs should be easy enough.

I haven't added coverage for multi-processing yet, but given what you found above, I should. I'll probably just system() a copy of the same proc with instructions to only fsync(). I'll do that next.

I haven't worked out a reliable way to trigger ENOSPC on fsync() yet, when mapping without the error hole. It happens sometimes but I don't know why, it almost always happens on write() instead. I know it can happen on nfs, but I'm hoping for a saner example than that to test with. ext4 and xfs do delayed allocation but eager reservation so it shouldn't happen to them.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-04 13:49:38

On Wed, Apr 4, 2018 at 07:32:04PM +1200, Thomas Munro wrote:
> On Wed, Apr 4, 2018 at 6:00 PM, Craig Ringer wrote:
>> On 4 April 2018 at 13:29, Thomas Munro wrote:
>>>     /* Ensure that we skip any errors that predate opening of the file */
>>>     f->f_wb_err = filemap_sample_wb_err(f->f_mapping);
>>> [...]
>> Holy hell.
>> So even PANICing on fsync() isn't sufficient, because the kernel will deliberately hide writeback errors that predate our fsync() call from us?
> Predates the opening of the file by the process that calls fsync(). Yeah, it sure looks that way based on the above code fragment. Does anyone know better?

Uh, just to clarify, what is new here is that it is ignoring any errors that happened before the open(). It is not ignoring write()'s that happened but have not been written to storage before the open().

FYI, pg_test_fsync has always tested the ability to fsync() writes() from other processes:

    Test if fsync on non-write file descriptor is honored:
    (If the times are similar, fsync() can sync data written on a different descriptor.)
            write, fsync, close    5360.341 ops/sec    187 usecs/op
            write, close, fsync    4785.240 ops/sec    209 usecs/op

Those two numbers should be similar. I added this as a check to make sure the behavior we were relying on was working. I never tested sync errors though.

I think the fundamental issue is that we always assumed that writes to the kernel that could not be written to storage would remain in the kernel until they succeeded, and that fsync() would report their existence.

I can understand why kernel developers don't want to keep failed sync buffers in memory, and once they are gone we lose reporting of their failure. Also, if the kernel is going to not retry the syncs, how long should it keep reporting the sync failure? To the first fsync that happens after the failure? How long should it continue to record the failure? What if no fsync() ever happens, which is likely for non-Postgres workloads? I think once they decided to discard failed syncs and not retry them, the fsync behavior we are complaining about was almost required.

Our only option might be to tell administrators to closely watch for kernel write failure messages, and then restore or failover.
:-(

The last time I remember being this surprised about storage was in the early Postgres years when we learned that just because the BSD filesystem uses 8k pages doesn't mean those are atomically written to storage. We knew the operating system wrote the data in 8k chunks to storage but:

- the 8k pages are written as separate 512-byte sectors
- the 8k might be contiguous logically on the drive but not physically
- even 512-byte sectors are not written atomically

This is why pre-page images are written to WAL, which is what full_page_writes controls.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-04 13:53:01

On Wed, Apr 4, 2018 at 10:40:16AM +0800, Craig Ringer wrote:
> The trouble with direct I/O is that it pushes a lot of work back on PostgreSQL regarding knowledge of the storage subsystem, I/O scheduling, etc. It's absurd to have the kernel do this, unless you want it reliable, in which case you bypass it and drive the hardware directly.
>
> We'd need pools of writer threads to deal with all the blocking I/O. It'd be such a nightmare. Hey, why bother having a kernel at all, except for drivers?

I believe this is how Oracle views the kernel, so there is precedent for this approach, though I am not advocating it.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-04 14:00:15

On 4 April 2018 at 15:51, Craig Ringer wrote:
> On 4 April 2018 at 14:00, Craig Ringer wrote:
>> On 4 April 2018 at 13:29, Thomas Munro wrote:
>>> On Wed, Apr 4, 2018 at 2:44 PM, Thomas Munro wrote:
>>>> On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian wrote:
>>>>> Uh, are you sure it fixes our use-case? From the email description it sounded like it only reported fsync errors for every open file descriptor at the time of the failure, but the checkpoint process might open the file after the failure and try to fsync a write that happened before the failure.
>>>> I'm not sure of anything.
>>>> I can see that it's designed to report errors since the last fsync() of the file (presumably via any fd), which sounds like the desired behaviour:
>>>> [..]
>>> Scratch that. Whenever you open a file descriptor you can't see any preceding errors at all, because:
>>>
>>>     /* Ensure that we skip any errors that predate opening of the file */
>>>     f->f_wb_err = filemap_sample_wb_err(f->f_mapping);
>>>
>>> https://github.com/torvalds/linux/blob/master/fs/open.c#L752
>>>
>>> Our whole design is based on being able to open, close and reopen files at will from any process, and in particular to fsync() from a different process that didn't inherit the fd but instead opened it later. But it looks like that might be able to eat errors that occurred during asynchronous writeback (when there was nobody to report them to), before you opened the file?
>> Holy hell. So even PANICing on fsync() isn't sufficient, because the kernel will deliberately hide writeback errors that predate our fsync() call from us?
>>
>> I'll see if I can expand my testcase for that. I'm presently dockerizing it to make it easier for others to use, but that turns out to be a major pain when using devmapper etc. Docker in privileged mode doesn't seem to play nice with device-mapper.
> Done, you can find it in https://github.com/ringerc/scrapcode/tree/master/testcases/fsync-error-clear now.

Update. Now supports multiple FSes.

I've tried xfs, jfs, ext3, ext4, even vfat. All behave the same on EIO. Didn't try zfs-on-linux or other platforms yet.

Still working on getting ENOSPC on fsync() rather than write(). Kernel code reading suggests this is possible, but all the above FSes reserve space eagerly on write() even if they do delayed allocation of the actual storage, so it doesn't seem to happen at least in my simple single-process test.

I'm not overly inclined to complain about a fsync() succeeding after a write() error. That seems reasonable enough, the kernel told the app at the time of the failure. What else is it going to do?
I don't personally even object hugely to the current fsync() behaviour if it were, say, DOCUMENTED and conformant to the relevant standards, though not giving us any sane way to find out the affected file ranges makes it drastically harder to recover sensibly.

But what's come out since on this thread, that we cannot even rely on fsync() giving us an EIO once when it loses our data, because:

- all currently widely deployed kernels can fail to deliver info due to a recently fixed limitation; and
- the kernel deliberately hides errors from us if they relate to writes that occurred before we opened the FD (?)

... that's really troubling. I thought we could at least fix this by PANICing on EIO, and was mostly worried about ENOSPC. But now it seems we can't even do that and expect reliability. So what the @#$ are we meant to do?

It's the error reporting issues around closing and reopening files with outstanding buffered I/O that's really going to hurt us here. I'll be expanding my test case to cover that shortly.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-04 14:09:09

On 4 April 2018 at 22:00, Craig Ringer wrote:
> It's the error reporting issues around closing and reopening files with outstanding buffered I/O that's really going to hurt us here. I'll be expanding my test case to cover that shortly.

Also, just to be clear, this is not in any way confined to xfs and/or lvm as I originally thought it might be.

Nor is ext3/ext4's errors=remount-ro protective. data_err=abort doesn't help either (so what does it do?).

What bewilders me is that running with data=journal doesn't seem to be safe either. WTF?

    [26438.846111] EXT4-fs (dm-0): mounted filesystem with journalled data mode.
    Opts: errors=remount-ro,data_err=abort,data=journal
    [26454.125319] EXT4-fs warning (device dm-0): ext4_end_bio:323: I/O error 10 writing to inode 12 (offset 0 size 0 starting block 59393)
    [26454.125326] Buffer I/O error on device dm-0, logical block 59393
    [26454.125337] Buffer I/O error on device dm-0, logical block 59394
    [26454.125343] Buffer I/O error on device dm-0, logical block 59395
    [26454.125350] Buffer I/O error on device dm-0, logical block 59396

and splat, there goes your data anyway.

It's possible that this is in some way related to using the device-mapper "error" target and a loopback device in testing. But I don't really see how.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-04 14:25:47

On Wed, Apr 4, 2018 at 10:09:09PM +0800, Craig Ringer wrote:
> On 4 April 2018 at 22:00, Craig Ringer wrote:
>> It's the error reporting issues around closing and reopening files with outstanding buffered I/O that's really going to hurt us here. I'll be expanding my test case to cover that shortly.
> Also, just to be clear, this is not in any way confined to xfs and/or lvm as I originally thought it might be.
>
> Nor is ext3/ext4's errors=remount-ro protective. data_err=abort doesn't help either (so what does it do?).

Anthony Iliopoulos reported in this thread that errors=remount-ro is only affected by metadata writes.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-04 14:42:18

On 4 April 2018 at 22:25, Bruce Momjian wrote:
> On Wed, Apr 4, 2018 at 10:09:09PM +0800, Craig Ringer wrote:
>> On 4 April 2018 at 22:00, Craig Ringer wrote:
>>> It's the error reporting issues around closing and reopening files with outstanding buffered I/O that's really going to hurt us here. I'll be expanding my test case to cover that shortly.
>> Also, just to be clear, this is not in any way confined to xfs and/or lvm as I originally thought it might be.
>>
>> Nor is ext3/ext4's errors=remount-ro protective.
>> data_err=abort doesn't help either (so what does it do?).
> Anthony Iliopoulos reported in this thread that errors=remount-ro is only affected by metadata writes.

Yep, I gathered. I was referring to data_err.

From: Antonis Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-04 15:23:31

On Wed, Apr 4, 2018 at 4:42 PM, Craig Ringer wrote:
> On 4 April 2018 at 22:25, Bruce Momjian wrote:
>> On Wed, Apr 4, 2018 at 10:09:09PM +0800, Craig Ringer wrote:
>>> On 4 April 2018 at 22:00, Craig Ringer wrote:
>>>> It's the error reporting issues around closing and reopening files with outstanding buffered I/O that's really going to hurt us here. I'll be expanding my test case to cover that shortly.
>>> Also, just to be clear, this is not in any way confined to xfs and/or lvm as I originally thought it might be.
>>>
>>> Nor is ext3/ext4's errors=remount-ro protective. data_err=abort doesn't help either (so what does it do?).
>> Anthony Iliopoulos reported in this thread that errors=remount-ro is only affected by metadata writes.
> Yep, I gathered. I was referring to data_err.

As far as I recall data_err=abort pertains to the jbd2 handling of potential writeback errors.
Jbd2 will internally attempt to drain the data upon txn commit (and it's even kind enough to restore the EIO at the address space level, that otherwise would get eaten).

When data_err=abort is set, then jbd2 forcibly shuts down the entire journal, with the error being propagated upwards to ext4. I am not sure at which point this would be manifested to userspace and how, but in principle any subsequent fs operations would get some filesystem error due to the journal being down (I would assume similar to remounting the fs read-only).

Since you are using data=journal, I would indeed expect to see something more than what you saw in dmesg.

I can have a look later, I plan to also respond to some of the other interesting issues that you guys raised in the thread.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-04 15:23:51

On 4 April 2018 at 21:49, Bruce Momjian wrote:
> On Wed, Apr 4, 2018 at 07:32:04PM +1200, Thomas Munro wrote:
>> On Wed, Apr 4, 2018 at 6:00 PM, Craig Ringer wrote:
>>> On 4 April 2018 at 13:29, Thomas Munro wrote:
>>>>     /* Ensure that we skip any errors that predate opening of the file */
>>>>     f->f_wb_err = filemap_sample_wb_err(f->f_mapping);
>>>> [...]
>>> Holy hell. So even PANICing on fsync() isn't sufficient, because the kernel will deliberately hide writeback errors that predate our fsync() call from us?
>> Predates the opening of the file by the process that calls fsync(). Yeah, it sure looks that way based on the above code fragment. Does anyone know better?
> Uh, just to clarify, what is new here is that it is ignoring any errors that happened before the open(). It is not ignoring write()'s that happened but have not been written to storage before the open().
>
> FYI, pg_test_fsync has always tested the ability to fsync() writes() from other processes:
>
>     Test if fsync on non-write file descriptor is honored:
>     (If the times are similar, fsync() can sync data written on a different descriptor.)
>             write, fsync, close    5360.341 ops/sec    187 usecs/op
>             write, close, fsync    4785.240 ops/sec    209 usecs/op
>
> Those two numbers should be similar. I added this as a check to make sure the behavior we were relying on was working. I never tested sync errors though.
>
> I think the fundamental issue is that we always assumed that writes to the kernel that could not be written to storage would remain in the kernel until they succeeded, and that fsync() would report their existence.
>
> I can understand why kernel developers don't want to keep failed sync buffers in memory, and once they are gone we lose reporting of their failure. Also, if the kernel is going to not retry the syncs, how long should it keep reporting the sync failure?

Ideally until the app tells it not to.

But there's no standard API for that.

The obvious answer seems to be "until the FD is closed". But we just discussed how Pg relies on being able to open and close files freely. That may not be as reasonable a thing to do as we thought it was when you consider error reporting. What's the kernel meant to do? How long should it remember "I had an error while doing writeback on this file"? Should it flag the file metadata and remember across reboots? Obviously not, but where does it stop? Tell the next program that does an fsync() and forget? How could it associate a dirty buffer on a file with no open FDs with any particular program at all? And what if the app did a write then closed the file and went away, never to bother to check the file again, like most apps do?

Some I/O errors are transient (network issue, etc). Some are recoverable with some sort of action, like disk space issues, but may take a long time before an admin steps in. Some are entirely unrecoverable (disk 1 in striped array is on fire) and there's no possible recovery. Currently we kind of hope the kernel will deal with figuring out which is which and retrying. Turns out it doesn't do that so much, and I don't think the reasons for that are wholly unreasonable.
We may have been asking too much.

That does leave us in a pickle when it comes to the checkpointer and opening/closing FDs. I don't know what the "right" thing for the kernel to do from our perspective even is here, but the best I can come up with is actually pretty close to what it does now. Report the fsync() error to the first process that does an fsync() since the writeback error if one has occurred, then forget about it. Ideally I'd have liked it to mark all FDs pointing to the file with a flag to report EIO on next fsync too, but it turns out that won't even help us due to our opening and closing behaviour, so we're going to have to take responsibility for handling and communicating that ourselves, preventing checkpoint completion if any backend gets an fsync error. Probably by PANICing. Some extra work may be needed to ensure reliable ordering and stop checkpoints completing if their fsync() succeeds due to a recent failed fsync() on a normal backend that hasn't PANICed or where the postmaster hasn't noticed yet.

> Our only option might be to tell administrators to closely watch for kernel write failure messages, and then restore or failover. :-(

Speaking of, there's not necessarily any lost page write error in the logs AFAICS. My tests often just show "Buffer I/O error on device dm-0, logical block 59393" or the like.

From: Gasper Zejn <zejn(at)owca(dot)info>
Date: 2018-04-04 17:23:58

On 04. 04. 2018 15:49, Bruce Momjian wrote:
> I can understand why kernel developers don't want to keep failed sync buffers in memory, and once they are gone we lose reporting of their failure. Also, if the kernel is going to not retry the syncs, how long should it keep reporting the sync failure? To the first fsync that happens after the failure? How long should it continue to record the failure? What if no fsync() ever happens, which is likely for non-Postgres workloads?
> I think once they decided to discard failed syncs and not retry them, the fsync behavior we are complaining about was almost required.

Ideally the kernel would keep its data for as little time as possible. With fsync, it doesn't really know which process is interested in knowing about a write error, it just assumes the caller will know how to deal with it. Most unfortunate issue is there's no way to get information about a write error.

Thinking aloud - couldn't/shouldn't a write error also be a file system event reported by inotify? Admittedly that's only a thing on Linux, but still.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-04 17:51:03

On Wed, Apr 4, 2018 at 11:23:51PM +0800, Craig Ringer wrote:
> On 4 April 2018 at 21:49, Bruce Momjian wrote:
>> I can understand why kernel developers don't want to keep failed sync buffers in memory, and once they are gone we lose reporting of their failure. Also, if the kernel is going to not retry the syncs, how long should it keep reporting the sync failure?
> Ideally until the app tells it not to.
>
> But there's no standard API for that.

You would almost need an API that registers before the failure that you care about sync failures, and that you plan to call fsync() to gather such information. I am not sure how you would allow more than the first fsync() to see the failure unless you added another API to clear the fsync failure, but I don't see the point since the first fsync() might call that clear function. How many applications are going to know there is another application that cares about the failure? Not many.

> Currently we kind of hope the kernel will deal with figuring out which is which and retrying. Turns out it doesn't do that so much, and I don't think the reasons for that are wholly unreasonable. We may have been asking too much.

Agreed.

>> Our only option might be to tell administrators to closely watch for kernel write failure messages, and then restore or failover.
>> :-(
> Speaking of, there's not necessarily any lost page write error in the logs AFAICS. My tests often just show "Buffer I/O error on device dm-0, logical block 59393" or the like.

I assume that is the kernel logs. I am thinking the kernel logs have to be monitored, but how many administrators do that? The other issue I think you are pointing out is how is the administrator going to know this is a Postgres file? I guess any sync error to a device that contains Postgres has to assume Postgres is corrupted. :-(

[...] see explicit treatment of retrying, though I'm not entirely sure if the retry flag is set just for async write-back), and apparently unlike every other kernel I've tried to grok so far (things descended from ancestral BSD but not descended from FreeBSD, with macOS/Darwin apparently in the first category for this purpose).

Here's a new ticket in the NetBSD bug database for this stuff:

http://gnats.netbsd.org/53152

As mentioned in that ticket and by Andres earlier in this thread, keeping the page dirty isn't the only strategy that would work and may be problematic in different ways (it tells the truth but floods your cache with unflushable stuff until eventually you force unmount it and your buffers are eventually invalidated after ENXIO errors? I don't know.). I have no qualified opinion on that. I just know that we need a way for fsync() to tell the truth about all preceding writes or our checkpoints are busted.

*We mmap() + msync() in pg_flush_data() if you don't have sync_file_range(), and I see now that that is probably not a great idea on ZFS because you'll finish up double-buffering (or is that triple-buffering?), flooding your page cache with transient data. Oops.
That is off-topic and not relevant for the checkpoint correctness topic of this thread though, since pg_flush_data() is advisory only.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-04 22:14:24

On Thu, Apr 5, 2018 at 9:28 AM, Thomas Munro wrote:
> On Thu, Apr 5, 2018 at 2:00 AM, Craig Ringer wrote:
>> I've tried xfs, jfs, ext3, ext4, even vfat. All behave the same on EIO. Didn't try zfs-on-linux or other platforms yet.
> While contemplating what exactly it would do (not sure),

See manual for failmode=wait | continue | panic. Even "continue" returns EIO to all new write requests, so they apparently didn't bother to supply an 'eat-my-data-but-tell-me-everything-is-fine' mode. Figures.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-05 07:09:57

Summary to date:

It's worse than I thought originally, because:

- Most widely deployed kernels have cases where they don't tell you about losing your writes at all; and
- Information about loss of writes can be masked by closing and re-opening a file

So the checkpointer cannot trust that a successful fsync() means ... a successful fsync().

Also, it's been reported to me off-list that anyone on the system calling sync(2) or the sync shell command will also generally consume the write error, causing us not to see it when we fsync(). The same is true for /proc/sys/vm/drop_caches. I have not tested these yet.

There's some level of agreement that we should PANIC on fsync() errors, at least on Linux, but likely everywhere. But we also now know it's insufficient to be fully protective.

I previously thought that errors=remount-ro was a sufficient safeguard. It isn't. There doesn't seem to be anything that is, for ext3, ext4, btrfs or xfs.

It's not clear to me yet why data_err=abort isn't sufficient in data=ordered or data=writeback mode on ext3 or ext4, needs more digging. (In my test tools that's:

    make FSTYPE=ext4 MKFSOPTS="" MOUNTOPTS="errors=remount-ro,data_err=abort,data=journal"

as of the current version d7fe802ec).
AFAICS that's because data_err=abort only affects data=ordered, not data=journal. If you use data=ordered, you at least get retries of the same write failing. This post https://lkml.org/lkml/2008/10/10/80 added the option and has some explanation, but doesn't explain why it doesn't affect data=journal.

zfs is probably not affected by the issues, per Thomas Munro. I haven't run my test scripts on it yet because my kernel doesn't have zfs support and I'm prioritising the multi-process / open-and-close issues.

So far none of the FSes and options I've tried exhibit the behaviour I actually want, which is to make the fs readonly or inaccessible on I/O error.

ENOSPC doesn't seem to be a concern during normal operation of major filesystems (ext3, ext4, btrfs, xfs) because they reserve space before returning from write(). But if a buffered write does manage to fail due to ENOSPC we'll definitely see the same problems. This makes ENOSPC on NFS a potentially data corrupting condition since NFS doesn't preallocate space before returning from write().

I think what we really need is a block-layer fix, where an I/O error flips the block device into read-only mode, as if blockdev --setro had been used. Though I'd settle for a kernel panic, frankly. I don't think anybody really wants this, but I'd rather either of those to silent data loss.

I'm currently tweaking my test to close and reopen the file between each write() and fsync(), and to support running with nfs.

I've also just found the device-mapper "flakey" driver, which looks fantastic for simulating unreliable I/O with intermittent faults.
I've been using the "error" target in a mapping, which lets me remap some of the device to always error, but "flakey" looks very handy for actual PostgreSQL testing.

For the sake of Google, these are errors known to be associated with the problem:

ext4, and ext3 mounted with ext4 driver:

    [42084.327345] EXT4-fs warning (device dm-0): ext4_end_bio:323: I/O error 10 writing to inode 12 (offset 0 size 0 starting block 59393)
    [42084.327352] Buffer I/O error on device dm-0, logical block 59393

xfs:

    [42193.771367] XFS (dm-0): writeback error on sector 118784
    [42193.784477] XFS (dm-0): writeback error on sector 118784

jfs: (nil, silence in the kernel logs)

You should also beware of "lost page write" or "lost write" errors.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-05 08:46:08

On 5 April 2018 at 15:09, Craig Ringer wrote:
> Also, it's been reported to me off-list that anyone on the system calling sync(2) or the sync shell command will also generally consume the write error, causing us not to see it when we fsync(). The same is true for /proc/sys/vm/drop_caches. I have not tested these yet.

I just confirmed this with a tweak to the test that

- records the file position
- close()s the fd
- sync()s
- open()s the file
- lseek()s back to the recorded position

This causes the test to completely ignore the I/O error, which is not reported to it at any time.

Fair enough, really, when you look at it from the kernel's point of view. What else can it do? Nobody has the file open. It'd have to mark the file itself as bad somehow. But that's pretty bad for our robustness AFAICS.

> There's some level of agreement that we should PANIC on fsync() errors, at least on Linux, but likely everywhere. But we also now know it's insufficient to be fully protective.

If dirty writeback fails between our close() and re-open() I see the same behaviour as with sync(). To test that I set dirty_writeback_centisecs and dirty_expire_centisecs to 1 and added a usleep(3*100*1000) between close() and open(). (It's still plenty slow).
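The close, sync(), reopen and seek sequence from the test tweak described above can be sketched as a small helper. This is an illustration only, not the code from the test case repository; `close_sync_reopen` is a hypothetical name and error handling is kept minimal.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: close fd, let something other than our own fsync()
 * flush dirty data, then reopen the same path and seek back to where we were.
 * Returns the new descriptor, or -1 on failure.
 */
int close_sync_reopen(int fd, const char *path)
{
    /* Record the current file position. */
    off_t pos = lseek(fd, 0, SEEK_CUR);
    if (pos == (off_t) -1)
        return -1;

    /* Drop our only descriptor for the file. */
    if (close(fd) != 0)
        return -1;

    /* System-wide flush; any writeback error happens while nobody has the file open. */
    sync();

    /* Reopen: per the discussion above, this descriptor starts with a clean error state. */
    int newfd = open(path, O_RDWR);
    if (newfd < 0)
        return -1;

    /* Return to the recorded position so writing can continue. */
    if (lseek(newfd, pos, SEEK_SET) == (off_t) -1) {
        close(newfd);
        return -1;
    }
    return newfd;
}
```

On a volume where writeback fails during the sync(), a later fsync() on the returned descriptor would, going by the observations in this thread, report success even though data was lost.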
So sync() is a convenient way to simulate something other than our own fsync() writing out the dirty buffer.

If I omit the sync() then we get the error reported by fsync() once when we re-open() the file and fsync() it, because the buffers weren't written out yet, so the error wasn't generated until we re-open()ed the file. But I doubt that'll happen much in practice because dirty writeback will get to it first so the error will be seen and discarded before we reopen the file in the checkpointer.

In other words, it looks like even with a new kernel with the error reporting bug fixes, if I understand how the backends and checkpointer interact when it comes to file descriptors, we're unlikely to notice I/O errors and fail a checkpoint. We may notice I/O errors if a backend does its own eager writeback for large I/O operations, or if the checkpointer fsync()s a file before the kernel's dirty writeback gets around to trying to flush the pages that will fail.

I haven't tested anything with multiple processes / multiple FDs yet, where we keep one fd open while writing on another.

But at this point I don't see any way to make Pg reliably detect I/O errors and fail a checkpoint then redo and retry. To even fix this by PANICing like I proposed originally, we need to know we have to PANIC.

AFAICS it's completely unsafe to write(), close(), open() and fsync() and expect that the fsync() makes any promises about the write(). Which if I read Pg's low level storage code right, makes it completely unable to reliably detect I/O errors.

When you put it that way, it sounds fair enough too.
How long is the kernel meant to remember that there was a write error on the file triggered by a write initiated by some seemingly unrelated process, some unbounded time ago, on a since-closed file?

But it seems to put Pg on the fast track to O_DIRECT.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-05 19:33:14

On Thu, Apr 5, 2018 at 03:09:57PM +0800, Craig Ringer wrote:

> ENOSPC doesn't seem to be a concern during normal operation of major file systems (ext3, ext4, btrfs, xfs) because they reserve space before returning from write(). But if a buffered write does manage to fail due to ENOSPC we'll definitely see the same problems. This makes ENOSPC on NFS a potentially data corrupting condition since NFS doesn't preallocate space before returning from write().

This does explain why NFS has a reputation for unreliability for Postgres.

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Date: 2018-04-05 23:37:42

Note: as I've brought up in another thread, it turns out that PG is not handling fsync errors correctly even when the OS does do the right thing (discovered by testing on FreeBSD).

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-06 01:27:05

On 6 April 2018 at 07:37, Andrew Gierth wrote:

> Note: as I've brought up in another thread, it turns out that PG is not handling fsync errors correctly even when the OS does do the right thing (discovered by testing on FreeBSD).

Yikes.
For other readers, the related thread for this is

Meanwhile, I've extended my test to run postgres on a deliberately faulty volume and confirmed my results there.

2018-04-06 01:11:40.555 UTC [58] LOG: checkpoint starting: immediate force wait
2018-04-06 01:11:40.567 UTC [58] ERROR: could not fsync file "base/12992/16386": Input/output error
2018-04-06 01:11:40.655 UTC [66] ERROR: checkpoint request failed
2018-04-06 01:11:40.655 UTC [66] HINT: Consult recent messages in the server log for details.
2018-04-06 01:11:40.655 UTC [66] STATEMENT: CHECKPOINT
Checkpoint failed with checkpoint request failed
HINT: Consult recent messages in the server log for details.
Retrying
2018-04-06 01:11:41.568 UTC [58] LOG: checkpoint starting: immediate force wait
2018-04-06 01:11:41.614 UTC [58] LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.000 s, total=0.046 s; sync files=3, longest=0.000 s, average=0.000 s; distance=2727 kB, estimate=2779 kB

Given your report, now I have to wonder if we even reissued the fsync() at all this time. 'perf' time. OK, with

sudo perf record -e syscalls:sys_enter_fsync,syscalls:sys_exit_fsync -a
sudo perf script

I see the failed fsync, then the same fd being fsync()d without error on the next checkpoint, which succeeds.

postgres 9602 [003] 72380.325817: syscalls:sys_enter_fsync: fd: 0x00000005
postgres 9602 [003] 72380.325931: syscalls:sys_exit_fsync: 0xfffffffffffffffb
...
postgres 9602 [000] 72381.336767: syscalls:sys_enter_fsync: fd: 0x00000005
postgres 9602 [000] 72381.336840: syscalls:sys_exit_fsync: 0x0

... and Pg continues merrily on its way without realising it lost data:

[72379.834872] XFS (dm-0): writeback error on sector 118752
[72380.324707] XFS (dm-0): writeback error on sector 118688

In this test I set things up so the checkpointer would see the first fsync() error.
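For readers decoding the perf output: the raw sys_exit_fsync value 0xfffffffffffffffb is the 64-bit two's-complement encoding of -5, that is -EIO, which matches the "Input/output error" in the server log. The second call's 0x0 is plain success. A small C sketch of that decoding follows. It assumes the usual Linux convention that raw syscall return values in the range -4095..-1 are negated errno values; the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: turn a raw 64-bit syscall return value, as printed
 * by perf, into an errno value. Returns 0 when the raw value does not
 * encode an error (assumed Linux convention: errors are -4095..-1).
 */
int errno_from_raw_syscall_ret(uint64_t raw)
{
    if (raw >= (uint64_t) -4095)     /* top 4095 values encode -errno */
        return (int) (0 - raw);      /* unsigned negation recovers errno */
    return 0;
}

/* Self-check against the two values seen in the perf output. */
void check_perf_values(void)
{
    assert(errno_from_raw_syscall_ret(0xfffffffffffffffbULL) == EIO);
    assert(errno_from_raw_syscall_ret(0x0) == 0);
}
```

So the trace shows one fsync() failing with EIO and the very next fsync() on the same descriptor succeeding, with no write in between that could have repaired anything.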
But if I make checkpoints less frequent, the bgwriter aggressive, and kernel dirty writeback aggressive, it should be possible to have the failure go completely unobserved too. I'll try that next, because we've already largely concluded that the solution to the issue above is to PANIC on fsync() error. But if we don't see the error at all we're in trouble.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-06 02:53:56

On Fri, Apr 6, 2018 at 1:27 PM, Craig Ringer wrote:

> On 6 April 2018 at 07:37, Andrew Gierth wrote:
>
>> Note: as I've brought up in another thread, it turns out that PG is not handling fsync errors correctly even when the OS does do the right thing (discovered by testing on FreeBSD).
>
> Yikes. For other readers, the related thread for this is

Yeah. That's really embarrassing, especially after beating up on various operating systems all week. It's also an independent issue -- let's keep that on the other thread and get it fixed.

> I see the failed fsync, then the same fd being fsync()d without error on the next checkpoint, which succeeds.
>
> postgres 9602 [003] 72380.325817: syscalls:sys_enter_fsync: fd: 0x00000005
> postgres 9602 [003] 72380.325931: syscalls:sys_exit_fsync: 0xfffffffffffffffb
> ...
> postgres 9602 [000] 72381.336767: syscalls:sys_enter_fsync: fd: 0x00000005
> postgres 9602 [000] 72381.336840: syscalls:sys_exit_fsync: 0x0
>
> ... and Pg continues merrily on its way without realising it lost data:
>
> [72379.834872] XFS (dm-0): writeback error on sector 118752
> [72380.324707] XFS (dm-0): writeback error on sector 118688
>
> In this test I set things up so the checkpointer would see the first fsync() error. But if I make checkpoints less frequent, the bgwriter aggressive, and kernel dirty writeback aggressive, it should be possible to have the failure go completely unobserved too. I'll try that next, because we've already largely concluded that the solution to the issue above is to PANIC on fsync() error.
> But if we don't see the error at all we're in trouble.

I suppose you only see errors because the file descriptors linger open in the virtual file descriptor cache, which is a matter of luck depending on how many relation segment files you touched. One thing you could try to confirm our understanding of the Linux 4.13+ policy would be to hack PostgreSQL so that it reopens the file descriptor every time in mdsync(). See attached.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-06 03:20:22

On 6 April 2018 at 10:53, Thomas Munro wrote:

> On Fri, Apr 6, 2018 at 1:27 PM, Craig Ringer wrote:
>
>> On 6 April 2018 at 07:37, Andrew Gierth wrote:
>>
>>> Note: as I've brought up in another thread, it turns out that PG is not handling fsync errors correctly even when the OS does do the right thing (discovered by testing on FreeBSD).
>>
>> Yikes. For other readers, the related thread for this is news-spur.riddles.org.uk
>
> Yeah. That's really embarrassing, especially after beating up on various operating systems all week. It's also an independent issue -- let's keep that on the other thread and get it fixed.
>
>> I see the failed fsync, then the same fd being fsync()d without error on the next checkpoint, which succeeds.
>>
>> postgres 9602 [003] 72380.325817: syscalls:sys_enter_fsync: fd: 0x00000005
>> postgres 9602 [003] 72380.325931: syscalls:sys_exit_fsync: 0xfffffffffffffffb
>> ...
>> postgres 9602 [000] 72381.336767: syscalls:sys_enter_fsync: fd: 0x00000005
>> postgres 9602 [000] 72381.336840: syscalls:sys_exit_fsync: 0x0
>>
>> ... and Pg continues merrily on its way without realising it lost data:
>>
>> [72379.834872] XFS (dm-0): writeback error on sector 118752
>> [72380.324707] XFS (dm-0): writeback error on sector 118688
>>
>> In this test I set things up so the checkpointer would see the first fsync() error. But if I make checkpoints less frequent, the bgwriter aggressive, and kernel dirty writeback aggressive, it should be possible to have the failure go completely unobserved too.
>> I'll try that next, because we've already largely concluded that the solution to the issue above is to PANIC on fsync() error. But if we don't see the error at all we're in trouble.
>
> I suppose you only see errors because the file descriptors linger open in the virtual file descriptor cache, which is a matter of luck depending on how many relation segment files you touched.

In this case I think it's because the kernel didn't get around to doing the writeback before the eagerly forced checkpoint fsync()'d it. Or we didn't even queue it for writeback from our own shared_buffers until just before we fsync()'d it. After all, it's a contrived test case that tries to reproduce the issue rapidly with big writes and frequent checkpoints.

So the checkpointer had the relation open to fsync() it, and it was the checkpointer's fsync() that did writeback on the dirty page and noticed the error.

If the kernel had done the writeback before the checkpointer opened the relation to fsync() it, we might not have seen the error at all - though as you note this depends on the file descriptor cache. You can see the silent-error behaviour in my standalone test case where I confirmed the post-4.13 behaviour. (I'm on 4.14 here).

I can try to reproduce it with postgres too, but it not only requires closing and reopening the FDs, it also requires forcing writeback before opening the fd. To make it occur in a practical timeframe I have to make my kernel writeback settings insanely aggressive and/or call sync() before re-open()ing. I don't really think it's worth it, since I've confirmed the behaviour already with the simpler test in standalone/ in the test repo.
To try it yourself, clone

https://github.com/ringerc/scrapcode

and in the master branch

cd testcases/fsync-error-clear
less README
make REOPEN=reopen standalone-run

See https://github.com/ringerc/scrapcode/blob/master/testcases/fsync-error-clear/standalone/fsync-error-clear.c#L118.

I've pushed the postgres test to that repo too; "make postgres-run".

You'll need docker, and be warned, it's using privileged docker containers and messing with dmsetup.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-08 02:16:07

So, what can we actually do about this new Linux behaviour?

Idea 1:

- whenever you open a file, either tell the checkpointer so it can open it too (and wait for it to tell you that it has done so, because it's not safe to write() until then), or send it a copy of the file descriptor via IPC (since duplicated file descriptors share the same f_wb_err)
- if the checkpointer can't take any more file descriptors (how would that limit even work in the IPC case?), then it somehow needs to tell you that so that you know that you're responsible for fsyncing that file yourself, both on close (due to fd cache recycling) and also when the checkpointer tells you to

Maybe it could be made to work, but sheesh, that seems horrible.
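The "send it a copy of the file descriptor via IPC" half of Idea 1 would presumably rely on SCM_RIGHTS ancillary data over a Unix-domain socket, where the receiver ends up with a descriptor that refers to the same open file description as the sender's (as with dup()). A minimal sketch in C, under that assumption: `send_fd` and `recv_fd` are hypothetical names for this example, not PostgreSQL functions, and nothing here addresses the descriptor-limit problem raised in the second bullet.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one file descriptor over a connected AF_UNIX socket. 0 on success. */
int send_fd(int sock, int fd)
{
    char byte = 'F';                        /* must carry at least one data byte */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {                                 /* union guarantees cmsghdr alignment */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;

    assert(sock >= 0 && fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;           /* payload is an array of fds */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor; returns the new fd, or -1 on failure. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In the scheme described above, a backend would call something like `send_fd()` right after open() and before its first write(), and the checkpointer would keep the received descriptor until it has fsync()ed the file.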
Is there some simpler idea along these lines that could make sure that fsync() is only ever called on file descriptors that were opened before all unflushed writes, or file descriptors cloned from such file descriptors?

Idea 2:

Give up, complain that this implementation is defective and unworkable, both on POSIX-compliance grounds and on POLA grounds, and campaign to get it fixed more fundamentally (actual details left to the experts, no point in speculating here, but we've seen a few approaches that work on other operating systems including keeping buffers dirty and marking the whole filesystem broken/read-only).

Idea 3:

Give up on buffered IO and develop an O_SYNC | O_DIRECT based system ASAP.

Any other ideas?

For a while I considered suggesting an idea which I now think doesn't work. I thought we could try asking for a new fcntl interface that spits out wb_err counter. Call it an opaque error token or something. Then we could store it in our fsync queue and safely close the file. Check again before fsync()ing, and if we ever see a different value, PANIC because it means a writeback error happened while we weren't looking. Sadly I think it doesn't work because AIUI inodes are not pinned in kernel memory when no one has the file open and there are no dirty buffers, so I think the counters could go away and be reset. Perhaps you could keep inodes pinned by keeping the associated buffers dirty after an error (like FreeBSD), but if you did that you'd have solved the problem already and wouldn't really need the wb_err system at all.
Is there some other idea along these lines that could work?

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-08 02:33:37

On Sun, Apr 8, 2018 at 02:16:07PM +1200, Thomas Munro wrote:

> So, what can we actually do about this new Linux behaviour?
>
> Idea 1:
>
> - whenever you open a file, either tell the checkpointer so it can open it too (and wait for it to tell you that it has done so, because it's not safe to write() until then), or send it a copy of the file descriptor via IPC (since duplicated file descriptors share the same f_wb_err)
> - if the checkpointer can't take any more file descriptors (how would that limit even work in the IPC case?), then it somehow needs to tell you that so that you know that you're responsible for fsyncing that file yourself, both on close (due to fd cache recycling) and also when the checkpointer tells you to
>
> Maybe it could be made to work, but sheesh, that seems horrible. Is there some simpler idea along these lines that could make sure that fsync() is only ever called on file descriptors that were opened before all unflushed writes, or file descriptors cloned from such file descriptors?
>
> Idea 2:
>
> Give up, complain that this implementation is defective and unworkable, both on POSIX-compliance grounds and on POLA grounds, and campaign to get it fixed more fundamentally (actual details left to the experts, no point in speculating here, but we've seen a few approaches that work on other operating systems including keeping buffers dirty and marking the whole filesystem broken/read-only).
>
> Idea 3:
>
> Give up on buffered IO and develop an O_SYNC | O_DIRECT based system ASAP.

Idea 4 would be for people to assume their database is corrupt if their server logs report any I/O error on the file systems Postgres uses.

From: Christophe Pettus <xof(at)thebuild(dot)com>
Date: 2018-04-08 02:37:47

On Apr 7, 2018, at 19:33, Bruce Momjian wrote:

> Idea 4 would be for people to assume their database is corrupt if their server logs report any I/O error on the file systems Postgres uses.

Pragmatically,
that's where we are right now. The best answer in this bad situation is (a) fix the error, then (b) replay from a checkpoint before the error occurred, but it appears we can't even guarantee that a PostgreSQL process will be the one to see the error.

--
-- Christophe Pettus
   xof(at)thebuild(dot)com

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-08 03:27:45

On 8 April 2018 at 10:16, Thomas Munro wrote:

> So, what can we actually do about this new Linux behaviour?

Yeah, I've been cooking over that myself.

More below, but here's an idea #5: decide InnoDB has the right idea, and go to using a single massive blob file, or a few giant blobs.

We have a storage abstraction that makes this way, way less painful than it should be.

We can virtualize relfilenodes into storage extents in relatively few big files. We could use sparse regions to make the addressing more convenient, but that makes copying and backup painful, so I'd rather not.

Even one file per tablespace for persistent relation heaps, another for indexes, another for each fork type.

That way we can use something like your #1 (which is what I was also thinking about then rejecting previously), but reduce the pain by reducing the FD count drastically so exhausting FDs stops being a problem.

Previously I was leaning toward what you've described here:

> - whenever you open a file, either tell the checkpointer so it can open it too (and wait for it to tell you that it has done so, because it's not safe to write() until then), or send it a copy of the file descriptor via IPC (since duplicated file descriptors share the same f_wb_err)
> - if the checkpointer can't take any more file descriptors (how would that limit even work in the IPC case?), then it somehow needs to tell you that so that you know that you're responsible for fsyncing that file yourself, both on close (due to fd cache recycling) and also when the checkpointer tells you to
>
> Maybe it could be made to work, but sheesh, that seems horrible.
> Is there some simpler idea along these lines that could make sure that fsync() is only ever called on file descriptors that were opened before all unflushed writes, or file descriptors cloned from such file descriptors?

... and got stuck on "yuck, that's awful".

I was assuming we'd force early checkpoints if the checkpointer hit its fd limit, but that's even worse.

We'd need to urgently do away with segmented relations, and partitions would start to become a hindrance.

Even then it's going to be an unworkable nightmare with heavily partitioned systems, systems that use schema-sharding, etc. And it'll mean we need to play with process limits and, often, system wide limits on FDs. I imagine the performance implications won't be pretty.

> Idea 2:
>
> Give up, complain that this implementation is defective and unworkable, both on POSIX-compliance grounds and on POLA grounds, and campaign to get it fixed more fundamentally (actual details left to the experts, no point in speculating here, but we've seen a few approaches that work on other operating systems including keeping buffers dirty and marking the whole filesystem broken/read-only).

This appears to be what SQLite does AFAICS.

https://www.sqlite.org/atomiccommit.html

though it has the huge luxury of a single writer, so it's probably only subject to the original issue not the multiprocess / checkpointer issues we face.

> Idea 3:
>
> Give up on buffered IO and develop an O_SYNC | O_DIRECT based system ASAP.

That seems to be what the kernel folks will expect. But that's going to KILL performance. We'll need writer threads to have any hope of it not totally sucking, because otherwise simple things like updating a heap tuple and two related indexes will incur enormous disk latencies.

But I suspect it's the path forward.

Goody.

> Any other ideas?
>
> For a while I considered suggesting an idea which I now think doesn't work. I thought we could try asking for a new fcntl interface that spits out wb_err counter.
> Call it an opaque error token or something. Then we could store it in our fsync queue and safely close the file. Check again before fsync()ing, and if we ever see a different value, PANIC because it means a writeback error happened while we weren't looking. Sadly I think it doesn't work because AIUI inodes are not pinned in kernel memory when no one has the file open and there are no dirty buffers, so I think the counters could go away and be reset. Perhaps you could keep inodes pinned by keeping the associated buffers dirty after an error (like FreeBSD), but if you did that you'd have solved the problem already and wouldn't really need the wb_err system at all. Is there some other idea along these lines that could work?

I think our underlying data syncing concept is fundamentally broken, and it's not really the kernel's fault.

We assume that we can safely:

procA: open()
procA: write()
procA: close()

... some long time later, unbounded as far as the kernel is concerned ...

procB: open()
procB: fsync()
procB: close()

If the kernel does writeback in the middle, how on earth is it supposed to know we expect to reopen the file and check back later?

Should it just remember "this file had an error" forever, and tell every caller? In that case how could we recover? We'd need some new API to say "yeah, ok already, I'm redoing all my work since the last good fsync() so you can clear the error flag now". Otherwise it'd keep reporting an error after we did redo to recover, too.

I never really clicked to the fact that we closed relations with pending buffered writes, left them closed, then reopened them to fsync. That's .... well, the kernel isn't the only thing doing crazy things here.

Right now I think we're at option (4): If you see anything that smells like a write error in your kernel logs, hard-kill postgres with -m immediate (do NOT let it do a shutdown checkpoint). If it did a checkpoint since the logs, fake up a backup label to force redo to start from the last checkpoint before the error.
Otherwise, it's safe to just let it start up again and do redo again.

Fun times.

This also means AFAICS that running Pg on NFS is extremely unsafe, you MUST make sure you don't run out of disk. Because the usual safeguard of space reservation against ENOSPC in fsync doesn't apply to NFS. (I haven't tested this with nfsv3 in sync,hard,nointr mode yet, maybe that's safe, but I doubt it). The same applies to thin-provisioned storage. Just. Don't.

This helps explain various reports of corruption in Docker and various other tools that use various sorts of thin provisioning. If you hit ENOSPC in fsync(), bye bye data.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
Date: 2018-04-08 03:37:06

On Sat, Apr 7, 2018 at 8:27 PM, Craig Ringer wrote:

> More below, but here's an idea #5: decide InnoDB has the right idea, and go to using a single massive blob file, or a few giant blobs.
>
> We have a storage abstraction that makes this way, way less painful than it should be.
>
> We can virtualize relfilenodes into storage extents in relatively few big files. We could use sparse regions to make the addressing more convenient, but that makes copying and backup painful, so I'd rather not.
>
> Even one file per tablespace for persistent relation heaps, another for indexes, another for each fork type.

I'm not sure that we can do that now, since it would break the new "Optimize btree insertions for common case of increasing values" optimization. (I did mention this before it went in.)

I've asked Pavan to at least add a note to the nbtree README that explains the high level theory behind the optimization, as part of post-commit clean-up. I'll ask him to say something about how it might affect extent-based storage, too.

From: Christophe Pettus <xof(at)thebuild(dot)com>
Date: 2018-04-08 03:46:17

On Apr 7, 2018, at 20:27, Craig Ringer wrote:

> Right now I think we're at option (4): If you see anything that smells like a write error in your kernel logs, hard-kill postgres with -m immediate (do NOT let it do a shutdown checkpoint).
> If it did a checkpoint since the logs, fake up a backup label to force redo to start from the last checkpoint before the error. Otherwise, it's safe to just let it start up again and do redo again.

Before we spiral down into despair and excessive alcohol consumption, this is basically the same situation as a checksum failure or some other kind of uncorrected media-level error. The bad part is that we have to find out from the kernel logs rather than from PostgreSQL directly. But this does not strike me as otherwise significantly different from, say, an infrequently-accessed disk block reporting an uncorrectable error when we finally get around to reading it.

From: Andreas Karlsson <andreas(at)proxel(dot)se>
Date: 2018-04-08 09:41:06

On 04/08/2018 05:27 AM, Craig Ringer wrote:

> More below, but here's an idea #5: decide InnoDB has the right idea, and go to using a single massive blob file, or a few giant blobs.

FYI: MySQL has by default one file per table these days. The old approach with one massive file was a maintenance headache so they changed the default some releases ago.

https://dev.mysql.com/doc/refman/8.0/en/innodb-multiple-tablespaces.html

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-08 10:30:31

On 8 April 2018 at 11:46, Christophe Pettus wrote:

> On Apr 7, 2018, at 20:27, Craig Ringer wrote:
>
>> Right now I think we're at option (4): If you see anything that smells like a write error in your kernel logs, hard-kill postgres with -m immediate (do NOT let it do a shutdown checkpoint). If it did a checkpoint since the logs, fake up a backup label to force redo to start from the last checkpoint before the error. Otherwise, it's safe to just let it start up again and do redo again.
>
> Before we spiral down into despair and excessive alcohol consumption, this is basically the same situation as a checksum failure or some other kind of uncorrected media-level error. The bad part is that we have to find out from the kernel logs rather than from PostgreSQL directly.
> But this does not strike me as otherwise significantly different from, say, an infrequently-accessed disk block reporting an uncorrectable error when we finally get around to reading it.

I don't entirely agree - because it affects ENOSPC, I/O errors on thin provisioned storage, I/O errors on multipath storage, etc. (I identified the original issue on a thin provisioned system that ran out of backing space, mangling PostgreSQL in a way that made no sense at the time).

These are way more likely than bit flips or other storage level corruption, and things that we previously expected to detect and fail gracefully for.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-08 10:31:24

On 8 April 2018 at 17:41, Andreas Karlsson wrote:

> On 04/08/2018 05:27 AM, Craig Ringer wrote:
>
>> More below, but here's an idea #5: decide InnoDB has the right idea, and go to using a single massive blob file, or a few giant blobs.
>
> FYI: MySQL has by default one file per table these days. The old approach with one massive file was a maintenance headache so they changed the default some releases ago.
>
> https://dev.mysql.com/doc/refman/8.0/en/innodb-multiple-tablespaces.html

Huh, thanks for the update.

We should see how they handle reliable flushing and see if they've looked into it. If they haven't, we should give them a heads-up and if they have, let's learn from them.

From: Christophe Pettus <xof(at)thebuild(dot)com>
Date: 2018-04-08 16:38:03

On Apr 8, 2018, at 03:30, Craig Ringer wrote:

> These are way more likely than bit flips or other storage level corruption, and things that we previously expected to detect and fail gracefully for.

This is definitely bad, and it explains a few otherwise-inexplicable corruption issues we've seen. (And great work tracking it down!) I think it's important not to panic, though; PostgreSQL doesn't have a reputation for horrible data integrity. I'm not sure it makes sense to do a major rearchitecting of the storage layer (especially with pluggable storage coming along) to address this.
While the failure modes are more common, the solution (a PITR backup) is one that an installation should have anyway against media failures.

From: Greg Stark <stark(at)mit(dot)edu>
Date: 2018-04-08 21:23:21

On 8 April 2018 at 04:27, Craig Ringer wrote:

> On 8 April 2018 at 10:16, Thomas Munro wrote:
>
> If the kernel does writeback in the middle, how on earth is it supposed to know we expect to reopen the file and check back later?
>
> Should it just remember "this file had an error" forever, and tell every caller? In that case how could we recover? We'd need some new API to say "yeah, ok already, I'm redoing all my work since the last good fsync() so you can clear the error flag now". Otherwise it'd keep reporting an error after we did redo to recover, too.

There is no spoon^H^H^H^H^Herror flag. We don't need fsync to keep track of any errors. We just need fsync to accurately report whether all the buffers in the file have been written out. When you call fsync again the kernel needs to initiate i/o on all the dirty buffers and block until they complete successfully. If they complete successfully then nobody cares whether they had some failure in the past when i/o was initiated at some point in the past.

The problem is not that errors aren't being tracked correctly. The problem is that dirty buffers are being marked clean when they haven't been written out. They consider dirty filesystem buffers when there's hardware failure preventing them from being written "a memory leak".

As long as any error means the kernel has discarded writes then there's no real hope of any reliable operation through that interface.

Going to DIRECTIO is basically recognizing this. That the kernel filesystem buffer provides no reliable interface so we need to reimplement it ourselves in user space.

It's rather disheartening. Aside from having to do all that work we have the added barrier that we don't have as much information about the hardware as the kernel has.
We don't know where raid stripes begin and end, how big the memory controller buffers are or how to tell when they're full or empty or how to flush them. etc etc. We also don't know what else is going on on the machine.

From: Christophe Pettus <xof(at)thebuild(dot)com>
Date: 2018-04-08 21:28:43

On Apr 8, 2018, at 14:23, Greg Stark wrote:

> They consider dirty filesystem buffers when there's hardware failure preventing them from being written "a memory leak".

That's not an irrational position. File system buffers are not dedicated memory for file system caching; they're being used for that because no one has a better use for them at that moment. If an inability to flush them to disk meant that they suddenly became pinned memory, a large copy operation to a yanked USB drive could result in the system having no more allocatable memory. I guess in theory that they could swap them, but swapping out a file system buffer in hopes that sometime in the future it could be properly written doesn't seem very architecturally sound to me.

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-08 21:47:04

On Sun, Apr 08, 2018 at 10:23:21PM +0100, Greg Stark wrote:

> On 8 April 2018 at 04:27, Craig Ringer wrote:
>
>> On 8 April 2018 at 10:16, Thomas Munro wrote:
>>
>> If the kernel does writeback in the middle, how on earth is it supposed to know we expect to reopen the file and check back later?
>>
>> Should it just remember "this file had an error" forever, and tell every caller? In that case how could we recover? We'd need some new API to say "yeah, ok already, I'm redoing all my work since the last good fsync() so you can clear the error flag now". Otherwise it'd keep reporting an error after we did redo to recover, too.
>
> There is no spoon^H^H^H^H^Herror flag. We don't need fsync to keep track of any errors. We just need fsync to accurately report whether all the buffers in the file have been written out. When you call fsync
When you call fsyncInstead, fsync() reports when some of the buffers have not beenwritten out, due to reasons outlined before. As such it may makesome sense to maintain some tracking regarding errors even aftermarking failed dirty pages as clean (in fact it has been proposed,but this introduces memory overhead).again the kernel needs to initiate i/o on all the dirty buffers andblock until they complete successfully. If they complete successfullythen nobody cares whether they had some failure in the past when i/owas initiated at some point in the past.The question is, what should the kernel and application do in caseswhere this is simply not possible (according to freebsd that keepsdirty pages around after failure, for example, -EIO from the blocklayer is a contract for unrecoverable errors so it is pointless tokeep them dirty). You'd need a specialized interface to clear-outthe errors (and drop the dirty pages), or potentially just remountthe filesystem.The problem is not that errors aren't been tracked correctly. Theproblem is that dirty buffers are being marked clean when they haven'tbeen written out. They consider dirty filesystem buffers when there'shardware failure preventing them from being written "a memory leak".As long as any error means the kernel has discarded writes thenthere's no real hope of any reliable operation through that interface.This does not necessarily follow. Whether the kernel discards writesor not would not really help (see above). It is more a matter ofproper "reporting contract" between userspace and kernel, and trackingwould be a way for facilitating this vs. 
having a more complex userspace scheme (as described by others in this thread) where synchronization for fsync() is required in a multi-process application.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-08 22:29:16

On Sun, Apr 8, 2018 at 09:38:03AM -0700, Christophe Pettus wrote:

> On Apr 8, 2018, at 03:30, Craig Ringer wrote:
>
>> These are way more likely than bit flips or other storage level corruption, and things that we previously expected to detect and fail gracefully for.
>
> This is definitely bad, and it explains a few otherwise-inexplicable corruption issues we've seen. (And great work tracking it down!) I think it's important not to panic, though; PostgreSQL doesn't have a reputation for horrible data integrity. I'm not sure it makes sense to do a major rearchitecting of the storage layer (especially with pluggable storage coming along) to address this. While the failure modes are more common, the solution (a PITR backup) is one that an installation should have anyway against media failures.

I think the big problem is that we don't have any way of stopping Postgres at the time the kernel reports the errors to the kernel log, so we are then returning potentially incorrect results and committing transactions that might be wrong or lost. If we could stop Postgres when such errors happen, at least the administrator could fix the problem or fail over to a standby.

A crazy idea would be to have a daemon that checks the logs and stops Postgres when it sees something wrong.

From: Christophe Pettus <xof(at)thebuild(dot)com>
Date: 2018-04-08 23:10:24

On Apr 8, 2018, at 15:29, Bruce Momjian wrote:

> I think the big problem is that we don't have any way of stopping Postgres at the time the kernel reports the errors to the kernel log, so we are then returning potentially incorrect results and committing transactions that might be wrong or lost.

Yeah, it's bad.
In the short term, the best advice to installations is to monitor their kernel logs for errors (which very few do right now), and make sure they have a backup strategy which can encompass restoring from an error like this. Even Craig's smart fix of patching the backup label to recover from a previous checkpoint doesn't do much good if we don't have WAL records back that far (or one of the required WAL records also took a hit).In the longer term... O_DIRECT seems like the most plausible way out of this, but that might be popular with people running on file systems or OSes that don't have this issue. (Setting aside the daunting prospect of implementing that.)From:Andres Freund <andres(at)anarazel(dot)de>Date:2018-04-08 23:16:25On 2018-04-08 18:29:16 -0400, Bruce Momjian wrote:On Sun, Apr 8, 2018 at 09:38:03AM -0700, Christophe Pettus wrote:On Apr 8, 2018, at 03:30, Craig Ringer wrote:These are way more likely than bit flips or other storage levelcorruption, and things that we previously expected to detect andfail gracefully for.This is definitely bad, and it explains a few otherwise-inexplicablecorruption issues we've seen. (And great work tracking it down!) Ithink it's important not to panic, though; PostgreSQL doesn't have areputation for horrible data integrity. I'm not sure it makes senseto do a major rearchitecting of the storage layer (especially withpluggable storage coming along) to address this. While the failuremodes are more common, the solution (a PITR backup) is one that aninstallation should have anyway against media failures.I think the big problem is that we don't have any way of stoppingPostgres at the time the kernel reports the errors to the kernel log, sowe are then returning potentially incorrect results and committingtransactions that might be wrong or lost. 
If we could stop Postgreswhen such errors happen, at least the administrator could fix theproblem of fail-over to a standby.An crazy idea would be to have a daemon that checks the logs and stopsPostgres when it seems something wrong.I think the danger presented here is far smaller than some of thestatements in this thread might make one think. In all likelihood, onceyou've got an IO error that kernel level retries don't fix, yourdatabase is screwed. Whether fsync reports that or not is reallysomewhat besides the point. We don't panic that way when getting IOerrors during reads either, and they're more likely to be persistentthan errors during writes (because remapping on storage layer can fixissues, but not during reads).There's a lot of not so great things here, but I don't think there's anyneed to panic.We should fix things so that reported errors are treated with crashrecovery, and for the rest I think there's very fair arguments to bemade that that's far outside postgres's remit.I think there's pretty good reasons to go to direct IO where supported,but error handling doesn't strike me as a particularly good reason forthe move.From:Christophe Pettus <xof(at)thebuild(dot)com>Date:2018-04-08 23:27:57On Apr 8, 2018, at 16:16, Andres Freund wrote:We don't panic that way when getting IOerrors during reads either, and they're more likely to be persistentthan errors during writes (because remapping on storage layer can fixissues, but not during reads).There is a distinction to be drawn there, though, because we immediately pass an error back to the client on a read, but a write problem in this situation can be masked for an extended period of time.That being said...There's a lot of not so great things here, but I don't think there's anyneed to panic.No reason to panic, yes. We can assume that if this was a very big persistent problem, it would be much more widely reported. 
It would, however, be good to find a way to get the error surfaced back up to the client in a way that is not just monitoring the kernel logs.From:Craig Ringer <craig(at)2ndquadrant(dot)com>Date:2018-04-09 01:31:56On 9 April 2018 at 05:28, Christophe Pettus wrote:On Apr 8, 2018, at 14:23, Greg Stark wrote:They consider dirty filesystem buffers when there'shardware failure preventing them from being written "a memory leak".That's not an irrational position. File system buffers are notdedicated memory for file system caching; they're being used for thatbecause no one has a better use for them at that moment. If an inabilityto flush them to disk meant that they suddenly became pinned memory, alarge copy operation to a yanked USB drive could result in the systemhaving no more allocatable memory. I guess in theory that they could swapthem, but swapping out a file system buffer in hopes that sometime in thefuture it could be properly written doesn't seem very architecturally soundto me.Yep.Another example is a write to an NFS or iSCSI volume that goes awayforever. What if the app keeps write()ing in the hopes it'll come back, andby the time the kernel starts reporting EIO for write(), it's alreadysaddled with a huge volume of dirty writeback buffers it can't get rid ofbecause someone, one day, might want to know about them?You could make the argument that it's OK to forget if the entire filesystem goes away. But actually, why is that ok? What if it's remountedagain? That'd be really bad too, for someone expecting write reliability.You can coarsen from dirty buffer tracking to marking the FD(s) bad, butwhat if there's no FD to mark because the file isn't open at the moment?You can mark the inode cache entry and pin it, I guess. But what if yourapp triggered I/O errors over vast numbers of small files? Again, thekernel's left holding the ball.It doesn't know if/when an app will return to check. It doesn't know howlong to remember the failure for. 
It doesn't know when all interested clients have been informed and it can treat the fault as cleared/repaired, either, so it'd have to keep on reporting EIO for PostgreSQL's own writes and fsyncs() indefinitely, even once we do recovery.

The only way it could avoid that would be to keep the dirty writeback pages around and flagged bad, then clear the flag when a new write() replaces the same file range. I can't imagine that being practical.

Blaming the kernel for this sure is the easy way out.

But IMO we cannot rationally expect the kernel to remember error state forever for us, then forget it when we expect, all without actually telling it anything about our activities or even that we still exist and are still interested in the files/writes. We've closed the files and gone away.

Whatever we do, it's likely going to have to involve not doing that anymore.

Even if we can somehow convince the kernel folks to add a new interface for us that reports I/O errors to some listener, like an inotify/fnotify/dnotify/whatever-it-is-today-notify extension reporting errors in buffered async writes, we won't be able to rely on having it for 5-10 years, and only on Linux.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-09 01:35:06

On 9 April 2018 at 06:29, Bruce Momjian wrote:

> I think the big problem is that we don't have any way of stopping Postgres at the time the kernel reports the errors to the kernel log, so we are then returning potentially incorrect results and committing transactions that might be wrong or lost.

Right.

Specifically, we need a way to ask the kernel at checkpoint time "was everything written to [this set of files] flushed successfully since the last time I asked, no matter who did the writing and no matter how the writes were flushed?"

If the result is "no" we PANIC and redo.
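The "PANIC and redo" idea can be sketched in a few lines of Python, under the assumption that the only signal available is the status of fsync() itself (the thread notes that even this signal may be lost). This is not PostgreSQL code: `sync_all_or_panic`, `FatalStorageError` and the `OneShotFailure` test double are hypothetical names. The double models the behaviour described upthread, where an error is reported once and a later fsync() on the same file appears to succeed.

```python
import errno
import os


class FatalStorageError(RuntimeError):
    """Raised on the first fsync failure; callers should stop and run recovery."""


def sync_all_or_panic(fds, fsync=os.fsync):
    """fsync every descriptor; treat the first failure as fatal, never retry."""
    for fd in fds:
        try:
            fsync(fd)
        except OSError as exc:
            # A retry could report success even though the data is gone,
            # so escalate instead of looping.
            raise FatalStorageError(f"fsync failed on fd {fd}: {exc}") from exc


class OneShotFailure:
    """Hypothetical stand-in for fsync: fails once per bad fd, then 'succeeds'."""

    def __init__(self, bad_fds):
        self.pending = set(bad_fds)

    def __call__(self, fd):
        if fd in self.pending:
            self.pending.discard(fd)
            raise OSError(errno.EIO, os.strerror(errno.EIO))
```

With the test double, a second pass over the same descriptors raises nothing, which is why a retry loop would be unsafe in this model and the first failure has to be the one that stops the server.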
If the hardware/volume is screwed,the user can fail over to a standby, do PITR, etc.But we don't have any way to ask that reliably at present.From:Andres Freund <andres(at)anarazel(dot)de>Date:2018-04-09 01:55:10Hi,On 2018-04-08 16:27:57 -0700, Christophe Pettus wrote:On Apr 8, 2018, at 16:16, Andres Freund wrote:We don't panic that way when getting IOerrors during reads either, and they're more likely to be persistentthan errors during writes (because remapping on storage layer can fixissues, but not during reads).There is a distinction to be drawn there, though, because weimmediately pass an error back to the client on a read, but a writeproblem in this situation can be masked for an extended period oftime.Only if you're "lucky" enough that your clients actually read that data,and then you're somehow able to figure out across the whole stack thatthese 0.001% of transactions that fail are due to IO errors. Or you alsoneed to do log analysis.If you want to solve things like that you need regular reads of all yourdata, including verifications etc.From:Craig Ringer <craig(at)2ndquadrant(dot)com>Date:2018-04-09 02:00:41On 9 April 2018 at 07:16, Andres Freund wrote:I think the danger presented here is far smaller than some of thestatements in this thread might make one think.Clearly it's not happening a huge amount or we'd have a lot of noise aboutPg eating people's data, people shouting about how unreliable it is, etc.We don't. So it's not some earth shattering imminent threat to everyone'sdata. It's gone unnoticed, or the root cause unidentified, for a long time.I suspect we've written off a fair few issues in the past as "it'd badhardware" when actually, the hardware fault was the trigger for a Pg/kernelinteraction bug. And blamed containers for things that weren't really thecontainer's fault. But even so, if it were happening tons, we'd hear morenoise.I've already been very surprised there when I learned that PostgreSQLcompletely ignores wholly absent relfilenodes. 
Specifically, if youunlink() a relation's backing relfilenode while Pg is down and that filehas writes pending in the WAL. We merrily re-create it with uninitalizedpages and go on our way. As Andres pointed out in an offlist discussion,redo isn't a consistency check, and it's not obliged to fail in such cases.We can say "well, don't do that then" and define away file losses from FScorruption etc as not our problem, the lower levels we expect to take careof this have failed.We have to look at what checkpoints are and are not supposed to promise,and whether this is a problem we just define away as "not our problem, thelower level failed, we're not obliged to detect this and fail gracefully."We can choose to say that checkpoints are required to guarantee crash/powerloss safety ONLY and do not attempt to protect against I/O errors of anysort. In fact, I think we should likely amend the documentation for releaseversions to say just that.In all likelihood, onceyou've got an IO error that kernel level retries don't fix, yourdatabase is screwed.Your database is going to be down or have interrupted service. It'spossible you may have some unreadable data. This could result in localiseddamage to one or more relations. That could affect FK relationships,indexes, all sorts. 
If you're really unlucky you might lose something critical like pg_clog/ contents.

But in general your DB should be repairable/recoverable even in those cases.

And in many failure modes there's no reason to expect any data loss at all, like:

* Local disk fills up (seems to be safe already due to space reservation at write() time)
* Thin-provisioned storage backing local volume iSCSI or paravirt block device fills up
* NFS volume fills up
* Multipath I/O error
* Interruption of connectivity to network block device
* Disk develops localized bad sector where we haven't previously written data

Except for the ENOSPC on NFS, all the rest of the cases can be handled by expecting the kernel to retry forever and not return until the block is written or we reach the heat death of the universe. And NFS, well...

Part of the trouble is that the kernel won't retry forever in all these cases, and doesn't seem to have a way to ask it to in all cases.

And if the user hasn't configured it for the right behaviour in terms of I/O error resilience, we don't find out about it.

So it's not the end of the world, but it'd sure be nice to fix.

> Whether fsync reports that or not is really somewhat besides the point. We don't panic that way when getting IO errors during reads either, and they're more likely to be persistent than errors during writes (because remapping on storage layer can fix issues, but not during reads).

That's because reads don't make promises about what's committed and synced. I think that's quite different.

> We should fix things so that reported errors are treated with crash recovery, and for the rest I think there's very fair arguments to be made that that's far outside postgres's remit.

Certainly for current versions.

I think we need to think about a more robust path in future.
But it'scertainly not "stop the world" territory.The docs need an update to indicate that we explicitly disclaimresponsibility for I/O errors on async writes, and that the kernel and I/Ostack must be configured never to give up on buffered writes. If it does,that's not our problem anymore.From:Andres Freund <andres(at)anarazel(dot)de>Date:2018-04-09 02:06:12On 2018-04-09 10:00:41 +0800, Craig Ringer wrote:I suspect we've written off a fair few issues in the past as "it'd badhardware" when actually, the hardware fault was the trigger for a Pg/kernelinteraction bug. And blamed containers for things that weren't really thecontainer's fault. But even so, if it were happening tons, we'd hear morenoise.Agreed on that, but I think that's FAR more likely to be things likemultixacts, index structure corruption due to logic bugs etc.I've already been very surprised there when I learned that PostgreSQLcompletely ignores wholly absent relfilenodes. Specifically, if youunlink() a relation's backing relfilenode while Pg is down and that filehas writes pending in the WAL. We merrily re-create it with uninitalizedpages and go on our way. As Andres pointed out in an offlist discussion,redo isn't a consistency check, and it's not obliged to fail in such cases.We can say "well, don't do that then" and define away file losses from FScorruption etc as not our problem, the lower levels we expect to take careof this have failed.And it'd be a realy bad idea to behave differently.And in many failure modes there's no reason to expect any data loss at all,like:Local disk fills up (seems to be safe already due to space reservation atwrite() time)That definitely should be treated separately.Thin-provisioned storage backing local volume iSCSI or paravirt block device fills upNFS volume fills upThose should be the same as the above.I think we need to think about a more robust path in future. 
> But it's certainly not "stop the world" territory.

I think you're underestimating the complexity of doing that by at least two orders of magnitude.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-09 03:15:01

On 9 April 2018 at 10:06, Andres Freund wrote:

>> And in many failure modes there's no reason to expect any data loss at all, like:
>>
>> * Local disk fills up (seems to be safe already due to space reservation at write() time)
>
> That definitely should be treated separately.

It is, because all the FSes I looked at reserve space before returning from write(), even if they do delayed allocation. So they won't fail with ENOSPC at fsync() time or silently due to lost errors on background writeback. Otherwise we'd be hearing a LOT more noise about this.

>> * Thin-provisioned storage backing local volume iSCSI or paravirt block device fills up
>> * NFS volume fills up
>
> Those should be the same as the above.

Unfortunately, they aren't.

AFAICS NFS doesn't reserve space with the other end before returning from write(), even if mounted with the sync option. So we can get ENOSPC lazily when the buffer writeback fails due to a full backing file system. This then travels the same paths as EIO: we fsync(), ERROR, retry, appear to succeed, and carry on with life losing the data. Or we never hear about the error in the first place.

(There's a proposed extension that'd allow this, see https://tools.ietf.org/html/draft-iyer-nfsv4-space-reservation-ops-02#page-5, but I see no mention of it in fs/nfs. All the reserve_space / xdr_reserve_space stuff seems to be related to space in protocol messages at a quick read.)

Thin provisioned storage could vary a fair bit depending on the implementation. But the specific failure case I saw, prompting this thread, was on a volume using the stack:

xfs -> lvm2 -> multipath -> ??? -> SAN

(the HBA/iSCSI/whatever was not recorded by the looks, but IIRC it was iSCSI. I'm checking.)

The SAN ran out of space.
Due to use of thin provisioning, Linux thought there was plenty of space on the volume; LVM thought it had plenty of physical extents free and unallocated, XFS thought there was tons of free space, etc. The space exhaustion manifested as I/O errors on flushes of writeback buffers.

The logs were like this:

kernel: sd 2:0:0:1: [sdd] Unhandled sense code
kernel: sd 2:0:0:1: [sdd]
kernel: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
kernel: sd 2:0:0:1: [sdd]
kernel: Sense Key : Data Protect [current]
kernel: sd 2:0:0:1: [sdd]
kernel: Add. Sense: Space allocation failed write protect
kernel: sd 2:0:0:1: [sdd] CDB:
kernel: Write(16): **HEX-DATA-CUT-OUT**
kernel: Buffer I/O error on device dm-0, logical block 3098338786
kernel: lost page write due to I/O error on dm-0
kernel: Buffer I/O error on device dm-0, logical block 3098338787

The immediate cause was that Linux's multipath driver didn't seem to recognise the sense code as retryable, so it gave up and reported it to the next layer up (LVM). LVM and XFS both seem to think that the lower layer is responsible for retries, so they toss the write away, and tell any interested writers if they feel like it, per discussion upthread.

In this case Pg did get the news and reported fsync() errors on checkpoints, but it only reported an error once per relfilenode. Once it ran out of failed relfilenodes to cause the checkpoint to ERROR, it "completed" a "successful" checkpoint and kept on running until the resulting corruption started to manifest itself and it segfaulted some time later.
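Since the short-term advice in this thread is to watch the kernel log, the excerpt above can serve as test input for a small scanner. The following Python is only a sketch: `find_io_errors` and `IO_ERROR_PATTERNS` are hypothetical names, and the two patterns are copied from the messages quoted here, so other kernels, drivers or versions may well word them differently. What to do on a hit (alerting, stopping the server, failing over) is deliberately left out.

```python
import re

# Message shapes taken from the log excerpt in this thread; treat them as
# examples, not as a complete list of writeback-failure messages.
IO_ERROR_PATTERNS = [
    re.compile(r"Buffer I/O error on device (?P<dev>[^\s,]+), logical block (?P<block>\d+)"),
    re.compile(r"lost page write due to I/O error on (?P<dev>[^\s,]+)"),
]


def find_io_errors(lines):
    """Return (device, line) pairs for lines that look like writeback failures."""
    hits = []
    for line in lines:
        for pattern in IO_ERROR_PATTERNS:
            match = pattern.search(line)
            if match:
                hits.append((match.group("dev"), line.strip()))
                break  # one hit per line is enough
    return hits


sample = [
    "kernel: sd 2:0:0:1: [sdd] Unhandled sense code",
    "kernel: Buffer I/O error on device dm-0, logical block 3098338786",
    "kernel: lost page write due to I/O error on dm-0",
]
affected_devices = {dev for dev, _ in find_io_errors(sample)}
```

Note that the SCSI sense lines are ignored on purpose; only the two block-layer messages say that a buffered write was actually thrown away.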
As we've now learned, there's no guarantee we'd even get the news about the I/O errors at all.

WAL was on a separate volume that didn't run out of room immediately, so we didn't PANIC on WAL write failure and prevent the issue.

In this case if Pg had PANIC'd (and been able to guarantee to get the news of write failures reliably), there'd have been no corruption and no data loss despite the underlying storage issue.

If, prior to seeing this, you'd asked me "will my PostgreSQL database be corrupted if my thin-provisioned volume runs out of space" I'd have said "Surely not. PostgreSQL won't be corrupted by running out of disk space, it orders writes carefully and forces flushes so that it will recover gracefully from write failures."

Except not. I was very surprised.

BTW, it also turns out that the default for multipath is to give up on errors anyway; see the queue_if_no_path option and no_path_retries options. (Hint: run PostgreSQL with no_path_retries=queue). That's a sane default if you use O_DIRECT|O_SYNC, and otherwise pretty much a data-eating setup.

I regularly see rather a lot of multipath systems, iSCSI systems, SAN backed systems, etc. I think we need to be pretty clear that we expect them to retry indefinitely, and if they report an I/O error we cannot reliably handle it. We need to patch Pg to PANIC on any fsync() failure and document that Pg won't notice some storage failure modes that might otherwise be considered nonfatal or transient, so very specific storage configuration and testing is required. (Not that anyone will do it). Also warn against running on NFS even with "hard,sync,nointr".

It'd be interesting to have a tool that tested error handling, allowing people to do iSCSI plug-pull tests, that sort of thing. But as far as I can tell nobody ever tests their storage stack anyway, so I don't plan on writing something that'll never get used.

>> I think we need to think about a more robust path in future.
But it'scertainly not "stop the world" territory.I think you're underestimating the complexity of doing that by at leasttwo orders of magnitude.Oh, it's just a minor total rewrite of half Pg, no big deal ;)I'm sure that no matter how big I think it is, I'm still underestimating it.The most workable option IMO would be some sort of fnotify/dnotify/whateverthat reports all I/O errors on a volume. Some kind of error reportinghandle we can keep open on a volume level that we can check for eachvolume/tablespace after we fsync() everything to see if it all reallyworked. If we PANIC if that gives us a bad answer, and PANIC on fsyncerrors, we guard against the great majority of these sorts ofshould-be-transient-if-the-kernel-didn't-give-up-and-throw-away-our-dataerrors.Even then, good luck getting those events from an NFS volume in which thebacking volume experiences an issue.And it's kind of moot because AFAICS no such interface exists.From:Greg Stark <stark(at)mit(dot)edu>Date:2018-04-09 08:45:40On 8 April 2018 at 22:47, Anthony Iliopoulos wrote:On Sun, Apr 08, 2018 at 10:23:21PM +0100, Greg Stark wrote:On 8 April 2018 at 04:27, Craig Ringer wrote:On 8 April 2018 at 10:16, Thomas Munro The question is, what should the kernel and application do in caseswhere this is simply not possible (according to freebsd that keepsdirty pages around after failure, for example, -EIO from the blocklayer is a contract for unrecoverable errors so it is pointless tokeep them dirty). You'd need a specialized interface to clear-outthe errors (and drop the dirty pages), or potentially just remountthe filesystem.Well firstly that's not necessarily the question. ENOSPC is not anunrecoverable error. And even unrecoverable errors for a single writedoesn't mean the write will never be able to succeed in the future.But secondly doesn't such an interface already exist? When the deviceis dropped any dirty pages already get dropped with it. 
What's thepoint in dropping them but keeping the failing device?But just to underline the point. "pointless to keep them dirty" isexactly backwards from the application's point of view. If the errorwriting to persistent media really is unrecoverable then it's all themore critical that the pages be kept so the data can be copied to someother device. The last thing user space expects to happen is if thedata can't be written to persistent storage then also immediatelydelete it from RAM. (And the really last thing user space expects isfor this to happen and return no error.)From:Anthony Iliopoulos <ailiop(at)altatus(dot)com>Date:2018-04-09 10:50:41On Mon, Apr 09, 2018 at 09:45:40AM +0100, Greg Stark wrote:On 8 April 2018 at 22:47, Anthony Iliopoulos wrote:On Sun, Apr 08, 2018 at 10:23:21PM +0100, Greg Stark wrote:On 8 April 2018 at 04:27, Craig Ringer wrote:On 8 April 2018 at 10:16, Thomas Munro The question is, what should the kernel and application do in caseswhere this is simply not possible (according to freebsd that keepsdirty pages around after failure, for example, -EIO from the blocklayer is a contract for unrecoverable errors so it is pointless tokeep them dirty). You'd need a specialized interface to clear-outthe errors (and drop the dirty pages), or potentially just remountthe filesystem.Well firstly that's not necessarily the question. ENOSPC is not anunrecoverable error. And even unrecoverable errors for a single writedoesn't mean the write will never be able to succeed in the future.To make things a bit simpler, let us focus on EIO for the moment.The contract between the block layer and the filesystem layer isassumed to be that of, when an EIO is propagated up to the fs,then you may assume that all possibilities for recovering havebeen exhausted in lower layers of the stack. 
Mind you, I am notclaiming that this contract is either documented or necessarilyrespected (in fact there have been studies on the error propagationand handling of the block layer, see [1]). Let us assume thatthis is the design contract though (which appears to be the caseacross a number of open-source kernels), and if not - it's a bug.In this case, indeed the specific write()s will never be ableto succeed in the future, at least not as long as the BIOs areallocated to the specific failing LBAs.But secondly doesn't such an interface already exist? When the deviceis dropped any dirty pages already get dropped with it. What's thepoint in dropping them but keeping the failing device?I think there are degrees of failure. There are certainly caseswhere one may encounter localized unrecoverable medium errors(specific to certain LBAs) that are non-maskable from the blocklayer and below. That does not mean that the device is droppedat all, so it does make sense to continue all other operationsto all other regions of the device that are functional. In casesof total device failure, then the filesystem will prevent youfrom proceeding anyway.But just to underline the point. "pointless to keep them dirty" isexactly backwards from the application's point of view. If the errorwriting to persistent media really is unrecoverable then it's all themore critical that the pages be kept so the data can be copied to someother device. The last thing user space expects to happen is if thedata can't be written to persistent storage then also immediatelydelete it from RAM. (And the really last thing user space expects isfor this to happen and return no error.)Right. This implies though that apart from the kernel havingto keep around the dirtied-but-unrecoverable pages for anunbounded time, that there's further an interface for obtainingthe exact failed pages so that you can read them back. 
This inturn means that there needs to be an association between thefsync() caller and the specific dirtied pages that the callerintents to drain (for which we'd need an fsync_range(), amongother things). BTW, currently the failed writebacks are notdropped from memory, but rather marked clean. They could belost though due to memory pressure or due to explicit request(e.g. proc drop_caches), unless mlocked.There is a clear responsibility of the application to keepits buffers around until a successful fsync(). The kernelsdo report the error (albeit with all the complexities ofdealing with the interface), at which point the applicationmay not assume that the write()s where ever even bufferedin the kernel page cache in the first place.What you seem to be asking for is the capability of droppingbuffers over the (kernel) fence and idemnifying the applicationfrom any further responsibility, i.e. a hard assurancethat either the kernel will persist the pages or it willkeep them around till the application recovers themasynchronously, the filesystem is unmounted, or the systemis rebooted.[1] https://www.usenix.org/legacy/event/fast08/tech/full_papers/gunawi/gunawi.pdfFrom:Geoff Winkless <pgsqladmin(at)geoff(dot)dj>Date:2018-04-09 12:03:28On 9 April 2018 at 11:50, Anthony Iliopoulos wrote:What you seem to be asking for is the capability of droppingbuffers over the (kernel) fence and idemnifying the applicationfrom any further responsibility, i.e. a hard assurancethat either the kernel will persist the pages or it willkeep them around till the application recovers themasynchronously, the filesystem is unmounted, or the systemis rebooted.That seems like a perfectly reasonable position to take, frankly.The whole point of an Operating System should be that you can do exactlythat. As a developer I should be able to call write() and fsync() and knowthat if both calls have succeeded then the result is on disk, no matterwhat another application has done in the meantime. 
If that's a "difficult" problem then that's the OS's problem, not mine. If the OS doesn't do that, it's _not_doing_its_job_.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-09 12:16:38

On 9 April 2018 at 18:50, Anthony Iliopoulos wrote:

> There is a clear responsibility of the application to keep its buffers around until a successful fsync(). The kernels do report the error (albeit with all the complexities of dealing with the interface), at which point the application may not assume that the write()s were ever even buffered in the kernel page cache in the first place.
>
> What you seem to be asking for is the capability of dropping buffers over the (kernel) fence and indemnifying the application from any further responsibility, i.e. a hard assurance that either the kernel will persist the pages or it will keep them around till the application recovers them asynchronously, the filesystem is unmounted, or the system is rebooted.

That's what Pg appears to assume now, yes.

Whether that's reasonable is a whole different topic.

I'd like a middle ground where the kernel lets us register our interest and tells us if it lost something, without us having to keep eight million FDs open for some long period. "Tell us about anything that happens under pgdata/" or an inotify-style per-directory-registration option. I'd even say that's ideal.

In the mean time, I propose that we fsync() on close() before we age FDs out of the LRU on backends. Yes, that will hurt throughput and cause stalls, but we don't seem to have many better options. At least it'll only flush what we actually wrote to the OS buffers not what we may have in shared_buffers.
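The "fsync() before we age FDs out of the LRU" proposal can be illustrated with a toy descriptor cache. This Python sketch is not PostgreSQL's file descriptor management; `SyncingFdCache` and its methods are hypothetical, and it assumes a capacity of at least 1. The point it shows is only the ordering: flush while the descriptor is still open, so that any error reaches the process that did the writing, and close afterwards.

```python
import os
from collections import OrderedDict


class SyncingFdCache:
    """Hypothetical LRU cache of file descriptors that fsyncs before eviction."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumed to be >= 1
        self._fds = OrderedDict()  # path -> fd, least recently used first
        self.evicted = []          # paths flushed and closed, for inspection

    def get(self, path):
        """Return an open fd for path, evicting the oldest entries if needed."""
        fd = self._fds.pop(path, None)
        if fd is None:
            fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self._fds[path] = fd  # (re)insert as most recently used
        while len(self._fds) > self.capacity:
            old_path, old_fd = self._fds.popitem(last=False)
            self._flush_and_close(old_fd)
            self.evicted.append(old_path)
        return fd

    def _flush_and_close(self, fd):
        try:
            os.fsync(fd)  # an OSError here propagates to the caller
        finally:
            os.close(fd)

    def close_all(self):
        """Flush and close every cached descriptor, oldest first."""
        while self._fds:
            path, fd = self._fds.popitem(last=False)
            self._flush_and_close(fd)
            self.evicted.append(path)
```

The cost mentioned in the message is visible here: every eviction becomes a synchronous flush on the caller's path, which is where the stalls would come from.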
If the bgwriter does the same thing, we should be 100% safe from this problem on 4.13+, and it'd be trivial to make it a GUC much like the fsync or full_page_writes options that people can turn off if they know the risks / know their storage is safe / don't care.

Some keen person who wants to later could optimise it by adding a fsync worker thread pool in backends, so we don't block the main thread. Frankly that might be a nice thing to have in the checkpointer anyway. But it's out of scope for fixing this in durability terms.

I'm partway through a patch that makes fsync panic on errors now. Once that's done, the next step will be to force fsync on close() in md and see how we go with that.

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-09 12:31:27

On Mon, Apr 09, 2018 at 01:03:28PM +0100, Geoff Winkless wrote:
> On 9 April 2018 at 11:50, Anthony Iliopoulos wrote:
>
>> What you seem to be asking for is the capability of dropping buffers over the (kernel) fence and indemnifying the application from any further responsibility, i.e. a hard assurance that either the kernel will persist the pages or it will keep them around till the application recovers them asynchronously, the filesystem is unmounted, or the system is rebooted.
>
> That seems like a perfectly reasonable position to take, frankly.

Indeed, as long as you are willing to ignore the consequences of this design decision: mainly, how you would recover memory when no application is interested in clearing the error. At which point other applications with different priorities will find this position rather unreasonable since there can be no way out of it for them.

Good luck convincing any OS kernel upstream to go with this design.

> The whole point of an Operating System should be that you can do exactly that. As a developer I should be able to call write() and fsync() and know that if both calls have succeeded then the result is on disk, no matter what another application has done in the meantime.
If that's a "difficult"problem then that's the OS's problem, not mine. If the OS doesn't do that,it's _not_doing_itsjob.No OS kernel that I know of provides any promises for atomicity of awrite()+fsync() sequence, unless one is using O_SYNC. It doesn'tprovide you with isolation either, as this is delegated to userspace,where processes that share a file should coordinate accordingly.It's not a difficult problem, but rather the kernels provide a commondenominator of possible interfaces and designs that could accommodatea wider range of potential application scenarios for which the kernelcannot possibly anticipate requirements. There have been plenty ofexperimental works for providing a transactional (ACID) filesysteminterface to applications. On the opposite end, there have been quitea few commercial databases that completely bypass the kernel storagestack. But I would assume it is reasonable to figure out somethingbetween those two extremes that can work in a "portable" fashion.From:Anthony Iliopoulos <ailiop(at)altatus(dot)com>Date:2018-04-09 12:54:16On Mon, Apr 09, 2018 at 08:16:38PM +0800, Craig Ringer wrote:I'd like a middle ground where the kernel lets us register our interest andtells us if it lost something, without us having to keep eight million FDsopen for some long period. "Tell us about anything that happens underpgdata/" or an inotify-style per-directory-registration option. I'd evensay that's ideal.I see what you are saying. So basically you'd always maintain thenotification descriptor open, where the kernel would inject eventsrelated to writeback failures of files under watch (potentiallyenriched to contain info regarding the exact failed pages andthe file offset they map to). 
The kernel wouldn't even have to maintain per-page bits to trace the errors, since they will be consumed by the process that reads the events (or discarded, when the notification fd is closed).

Assuming this would be possible, wouldn't Pg still need to deal with synchronizing writers and related issues (since this would be merely a notification mechanism - not prevent any process from continuing), which I understand would be rather intrusive for the current Pg multi-process design.

But other than that, similarly this interface could in principle be similarly implemented in the BSDs via kqueue(), I suppose, to provide what you need.

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Date: 2018-04-09 13:33:18

On 04/09/2018 02:31 PM, Anthony Iliopoulos wrote:
> On Mon, Apr 09, 2018 at 01:03:28PM +0100, Geoff Winkless wrote:
> > On 9 April 2018 at 11:50, Anthony Iliopoulos wrote:
> > > What you seem to be asking for is the capability of dropping buffers over the (kernel) fence and idemnifying the application from any further responsibility, i.e. a hard assurance that either the kernel will persist the pages or it will keep them around till the application recovers them asynchronously, the filesystem is unmounted, or the system is rebooted.
> >
> > That seems like a perfectly reasonable position to take, frankly.
>
> Indeed, as long as you are willing to ignore the consequences of this design decision: mainly, how you would recover memory when no application is interested in clearing the error. At which point other applications with different priorities will find this position rather unreasonable since there can be no way out of it for them.

Sure, but the question is whether the system can reasonably operate after some of the writes failed and the data got lost. Because if it can't, then recovering the memory is rather useless.
It might be better to stop the system in that case, forcing the system administrator to resolve the issue somehow (fail-over to a replica, perform recovery from the last checkpoint, ...).

We already have dirty_bytes and dirty_background_bytes, for example. I don't see why there couldn't be another limit defining how much dirty data to allow before blocking writes altogether. I'm sure it's not that simple, but you get the general idea - do not allow using all available memory because of writeback issues, but don't throw the data away in case it's just a temporary issue.

> Good luck convincing any OS kernel upstream to go with this design.

Well, there seem to be kernels that seem to do exactly that already. At least that's how I understand what this thread says about FreeBSD and Illumos, for example. So it's not an entirely insane design, apparently.

The question is whether the current design makes it any easier for user-space developers to build reliable systems. We have tried using it, and unfortunately the answers seems to be "no" and "Use direct I/O and manage everything on your own!"

> > The whole point of an Operating System should be that you can do exactly that. As a developer I should be able to call write() and fsync() and know that if both calls have succeeded then the result is on disk, no matter what another application has done in the meantime. If that's a "difficult" problem then that's the OS's problem, not mine. If the OS doesn't do that, it's _not_doing_its_job_.
>
> No OS kernel that I know of provides any promises for atomicity of a write()+fsync() sequence, unless one is using O_SYNC. It doesn't provide you with isolation either, as this is delegated to userspace, where processes that share a file should coordinate accordingly.

We can (and do) take care of the atomicity and isolation. Implementation of those parts is obviously very application-specific, and we have WAL and locks for that purpose.
I/O on the other hand seems to be a generic service provided by the OS - at least that's how we saw it until now.

> It's not a difficult problem, but rather the kernels provide a common denominator of possible interfaces and designs that could accommodate a wider range of potential application scenarios for which the kernel cannot possibly anticipate requirements. There have been plenty of experimental works for providing a transactional (ACID) filesystem interface to applications. On the opposite end, there have been quite a few commercial databases that completely bypass the kernel storage stack. But I would assume it is reasonable to figure out something between those two extremes that can work in a "portable" fashion.

Users ask us about this quite often, actually. The question is usually about "RAW devices" and performance, but ultimately it boils down to buffered vs. direct I/O. So far our answer was we rely on kernel to do this reliably, because they know how to do that correctly and we simply don't have the manpower to implement it (portable, reliable, handling different types of storage, ...).

One has to wonder how many applications actually use this correctly, considering PostgreSQL cares about data durability/consistency so much and yet we've been misunderstanding how it works for 20+ years.

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Date: 2018-04-09 13:42:35

On 04/09/2018 12:29 AM, Bruce Momjian wrote:
> An crazy idea would be to have a daemon that checks the logs and stops Postgres when it seems something wrong.

That doesn't seem like a very practical way. It's better than nothing, of course, but I wonder how would that work with containers (where I think you may not have access to the kernel log at all). Also, I'm pretty sure the messages do change based on kernel version (and possibly filesystem) so parsing it reliably seems rather difficult.
And we probably don't want to PANIC after I/O error on an unrelated device, so we'd need to understand which devices are related to PostgreSQL.

From: Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>
Date: 2018-04-09 13:47:03

At 2018-04-09 15:42:35 +0200, tomas(dot)vondra(at)2ndquadrant(dot)com wrote:
> On 04/09/2018 12:29 AM, Bruce Momjian wrote:
> > An crazy idea would be to have a daemon that checks the logs and stops Postgres when it seems something wrong.
>
> That doesn't seem like a very practical way.

Not least because Craig's tests showed that you can't rely on always getting an error message in the logs.

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Date: 2018-04-09 13:54:19

On 04/09/2018 04:00 AM, Craig Ringer wrote:
> On 9 April 2018 at 07:16, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > I think the danger presented here is far smaller than some of the statements in this thread might make one think.
>
> Clearly it's not happening a huge amount or we'd have a lot of noise about Pg eating people's data, people shouting about how unreliable it is, etc. We don't. So it's not some earth shattering imminent threat to everyone's data. It's gone unnoticed, or the root cause unidentified, for a long time.

Yeah, it clearly isn't the case that everything we do suddenly got pointless. It's fairly annoying, though.

> I suspect we've written off a fair few issues in the past as "it'd bad hardware" when actually, the hardware fault was the trigger for a Pg/kernel interaction bug. And blamed containers for things that weren't really the container's fault. But even so, if it were happening tons, we'd hear more noise.

Right. Write errors are fairly rare, and we've probably ignored a fair number of cases demonstrating this issue.
It kinda reminds me the wisdom that not seeing planes with bullet holes in the engine does not mean engines don't need armor [1].

[1] https://medium.com/@penguinpress/an-excerpt-from-how-not-to-be-wrong-by-jordan-ellenberg-664e708cfc3d

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-09 14:22:06

On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote:
> We already have dirty_bytes and dirty_background_bytes, for example. I don't see why there couldn't be another limit defining how much dirty data to allow before blocking writes altogether. I'm sure it's not that simple, but you get the general idea - do not allow using all available memory because of writeback issues, but don't throw the data away in case it's just a temporary issue.

Sure, there could be knobs for limiting how much memory such "zombie" pages may occupy. Not sure how helpful it would be in the long run since this tends to be highly application-specific, and for something with a large data footprint one would end up tuning this accordingly in a system-wide manner. This has the potential to leave other applications running in the same system with very little memory, in cases where for example original application crashes and never clears the error. Apart from that, further interfaces would need to be provided for actually dealing with the error (again assuming non-transient issues that may not be fixed transparently and that temporary issues are taken care of by lower layers of the stack).

> Well, there seem to be kernels that seem to do exactly that already. At least that's how I understand what this thread says about FreeBSD and Illumos, for example. So it's not an entirely insane design, apparently.

It is reasonable, but even FreeBSD has a big fat comment right there (since 2017), mentioning that there can be no recovery from EIO at the block layer and this needs to be done differently.
No idea how an application running on top of either FreeBSD or Illumos would actually recover from this error (and clear it out), other than remounting the fs in order to force dropping of relevant pages. It does provide though indeed a persistent error indication that would allow Pg to simply reliably panic. But again this does not necessarily play well with other applications that may be using the filesystem reliably at the same time, and are now faced with EIO while their own writes succeed to be persisted.

Ideally, you'd want a (potentially persistent) indication of error localized to a file region (mapping the corresponding failed writeback pages). NetBSD is already implementing fsync_ranges(), which could be a step in the right direction.

> One has to wonder how many applications actually use this correctly, considering PostgreSQL cares about data durability/consistency so much and yet we've been misunderstanding how it works for 20+ years.

I would expect it would be very few, potentially those that have a very simple process model (e.g. embedded DBs that can abort a txn on fsync() EIO). I think that durability is a rather complex cross-layer issue which has been grossly misunderstood similarly in the past (e.g. see [1]). It seems that both the OS and DB communities greatly benefit from a periodic reality check, and I see this as an opportunity for strengthening the IO stack in an end-to-end manner.

[1] https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf

From: Greg Stark <stark(at)mit(dot)edu>
Date: 2018-04-09 15:29:36

On 9 April 2018 at 15:22, Anthony Iliopoulos wrote:
> On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote:
> Sure, there could be knobs for limiting how much memory such "zombie" pages may occupy.
> Not sure how helpful it would be in the long run since this tends to be highly application-specific, and for something with a large data footprint one would end up tuning this accordingly in a system-wide manner.

Surely this is exactly what the kernel is there to manage. It has to control how much memory is allowed to be full of dirty buffers in the first place to ensure that the system won't get memory starved if it can't clean them fast enough. That isn't even about persistent hardware errors. Even when the hardware is working perfectly it can only flush buffers so fast. The whole point of the kernel is to abstract away shared resources. It's not like user space has any better view of the situation here. If Postgres implemented all this in DIRECT_IO it would have exactly the same problem only with less visibility into what the rest of the system is doing. If every application implemented its own buffer cache we would be back in the same boat only with a fragmented memory allocation.

> This has the potential to leave other applications running in the same system with very little memory, in cases where for example original application crashes and never clears the error.

I still think we're speaking two different languages. There's no application anywhere that's going to "clear the error". The application has done the writes and if it's calling fsync it wants to wait until the filesystem can arrange for the write to be persisted. If the application could manage without the persistence then it wouldn't have called fsync.

The only way to "clear out" the error would be by having the writes succeed. There's no reason to think that wouldn't be possible sometime. The filesystem could remap blocks or an administrator could replace degraded raid device components. The only thing Postgres could do to recover would be create a new file and move the data (reading from the dirty buffer in memory!) to a new file anyways so we would "clear the error" by just no longer calling fsync on the old file.

We always read fsync as a simple write barrier. That's what the documentation promised and it's what Postgres always expected. It sounds like the kernel implementors looked at it as some kind of communication channel to communicate status report for specific writes back to user-space. That's a much more complex problem and would have entirely different interface. I think this is why we're having so much difficulty communicating.

> It is reasonable, but even FreeBSD has a big fat comment right there (since 2017), mentioning that there can be no recovery from EIO at the block layer and this needs to be done differently. No idea how an application running on top of either FreeBSD or Illumos would actually recover from this error (and clear it out), other than remounting the fs in order to force dropping of relevant pages. It does provide though indeed a persistent error indication that would allow Pg to simply reliably panic. But again this does not necessarily play well with other applications that may be using the filesystem reliably at the same time, and are now faced with EIO while their own writes succeed to be persisted.

Well if they're writing to the same file that had a previous error I doubt there are many applications that would be happy to consider their writes "persisted" when the file was corrupt. Ironically the earlier discussion quoted talked about how applications that wanted more granular communication would be using O_DIRECT -- but what we have is fsync trying to be too granular such that it's impossible to get any strong guarantees about anything with it.

> > One has to wonder how many applications actually use this correctly, considering PostgreSQL cares about data durability/consistency so much and yet we've been misunderstanding how it works for 20+ years.
>
> I would expect it would be very few, potentially those that have a very simple process model (e.g. embedded DBs that can abort a txn on fsync() EIO).

Honestly I don't think there's any way to use the current interface to implement reliable operation. Even that embedded database using a single process and keeping every file open all the time (which means file descriptor limits limit its scalability) can be having silent corruption whenever some other process like a backup program comes along and calls fsync (or even sync?).

From: Robert Haas <robertmhaas(at)gmail(dot)com>
Date: 2018-04-09 16:45:00

On Mon, Apr 9, 2018 at 8:16 AM, Craig Ringer wrote:
> In the mean time, I propose that we fsync() on close() before we age FDs out of the LRU on backends. Yes, that will hurt throughput and cause stalls, but we don't seem to have many better options. At least it'll only flush what we actually wrote to the OS buffers not what we may have in shared_buffers. If the bgwriter does the same thing, we should be 100% safe from this problem on 4.13+, and it'd be trivial to make it a GUC much like the fsync or full_page_writes options that people can turn off if they know the risks / know their storage is safe / don't care.

Ouch. If a process exits -- say, because the user typed \q into psql -- then you're talking about potentially calling fsync() on a really large number of file descriptor flushing many gigabytes of data to disk. And it may well be that you never actually wrote any data to any of those file descriptors -- those writes could have come from other backends. Or you may have written a little bit of data through those FDs, but there could be lots of other data that you end up flushing incidentally. Perfectly innocuous things like starting up a backend, running a few short queries, and then having that backend exit suddenly turn into something that could have a massive system-wide performance impact.

Also, if a backend ever manages to exit without running through this code, or writes any dirty blocks afterward, then this still fails to fix the problem completely.
I guess that's probably avoidable -- we can put this late in the shutdown sequence and PANIC if it fails.

I have a really tough time believing this is the right way to solve the problem. We suffered for years because of ext3's desire to flush the entire page cache whenever any single file was fsync()'d, which was terrible. Eventually ext4 became the norm, and the problem went away. Now we're going to deliberately insert logic to do a very similar kind of terrible thing because the kernel developers have decided that fsync() doesn't have to do what it says on the tin? I grant that there doesn't seem to be a better option, but I bet we're going to have a lot of really unhappy users if we do this.

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Date: 2018-04-09 17:26:24

On 04/09/2018 09:45 AM, Robert Haas wrote:
> On Mon, Apr 9, 2018 at 8:16 AM, Craig Ringer wrote:
> > In the mean time, I propose that we fsync() on close() before we age FDs out of the LRU on backends. Yes, that will hurt throughput and cause stalls, but we don't seem to have many better options. At least it'll only flush what we actually wrote to the OS buffers not what we may have in shared_buffers. If the bgwriter does the same thing, we should be 100% safe from this problem on 4.13+, and it'd be trivial to make it a GUC much like the fsync or full_page_writes options that people can turn off if they know the risks / know their storage is safe / don't care.
>
> I have a really tough time believing this is the right way to solve the problem. We suffered for years because of ext3's desire to flush the entire page cache whenever any single file was fsync()'d, which was terrible. Eventually ext4 became the norm, and the problem went away. Now we're going to deliberately insert logic to do a very similar kind of terrible thing because the kernel developers have decided that fsync() doesn't have to do what it says on the tin?
> I grant that there doesn't seem to be a better option, but I bet we're going to have a lot of really unhappy users if we do this.

I don't have a better option but whatever we do, it should be an optional (GUC) change. We have plenty of YEARS of people not noticing this issue and Robert's correct, if we go back to an era of things like stalls it is going to look bad on us no matter how we describe the problem.

From: Gasper Zejn <zejn(at)owca(dot)info>
Date: 2018-04-09 18:02:21

On 09. 04. 2018 15:42, Tomas Vondra wrote:
> On 04/09/2018 12:29 AM, Bruce Momjian wrote:
> > An crazy idea would be to have a daemon that checks the logs and stops Postgres when it seems something wrong.
>
> That doesn't seem like a very practical way. It's better than nothing, of course, but I wonder how would that work with containers (where I think you may not have access to the kernel log at all). Also, I'm pretty sure the messages do change based on kernel version (and possibly filesystem) so parsing it reliably seems rather difficult. And we probably don't want to PANIC after I/O error on an unrelated device, so we'd need to understand which devices are related to PostgreSQL.
>
> regards

For a bit less (or more) crazy idea, I'd imagine creating a Linux kernel module with kprobe/kretprobe capturing the file passed to fsync or even byte range within file and corresponding return value shouldn't be that hard.
Kprobe has been a part of Linux kernel for a really long time, and from first glance it seems like it could be backported to 2.6 too.

Then you could have stable log messages or implement some kind of "fsync error log notification" via whatever is the most sane way to get this out of kernel.

If the kernel is new enough and has eBPF support (seems like >=4.4), using bcc-tools[1] should enable you to write a quick script to get exactly that info via perf events[2].

Obviously, that's a stopgap solution ...

[1] https://github.com/iovisor/bcc
[2] https://blog.yadutaf.fr/2016/03/30/turn-any-syscall-into-event-introducing-ebpf-kernel-probes/

From: Mark Dilger <hornschnorter(at)gmail(dot)com>
Date: 2018-04-09 18:29:42

On Apr 9, 2018, at 10:26 AM, Joshua D. Drake wrote:
> We have plenty of YEARS of people not noticing this issue

I disagree. I have noticed this problem, but blamed it on other things. For over five years now, I have had to tell customers not to use thin provisioning, and I have had to add code to postgres to refuse to perform inserts or updates if the disk volume is more than 80% full. I have lost count of the number of customers who are running an older version of the product (because they refuse to upgrade) and come back with complaints that they ran out of disk and now their database is corrupt. All this time, I have been blaming this on virtualization and thin provisioning.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
Date: 2018-04-09 19:02:11

On Mon, Apr 9, 2018 at 12:45 PM, Robert Haas wrote:
> Ouch. If a process exits -- say, because the user typed \q into psql -- then you're talking about potentially calling fsync() on a really large number of file descriptor flushing many gigabytes of data to disk. And it may well be that you never actually wrote any data to any of those file descriptors -- those writes could have come from other backends. Or you may have written a little bit of data through those FDs, but there could be lots of other data that you end up flushing incidentally.
> Perfectly innocuous things like starting up a backend, running a few short queries, and then having that backend exit suddenly turn into something that could have a massive system-wide performance impact.
>
> Also, if a backend ever manages to exit without running through this code, or writes any dirty blocks afterward, then this still fails to fix the problem completely. I guess that's probably avoidable -- we can put this late in the shutdown sequence and PANIC if it fails.
>
> I have a really tough time believing this is the right way to solve the problem. We suffered for years because of ext3's desire to flush the entire page cache whenever any single file was fsync()'d, which was terrible. Eventually ext4 became the norm, and the problem went away. Now we're going to deliberately insert logic to do a very similar kind of terrible thing because the kernel developers have decided that fsync() doesn't have to do what it says on the tin? I grant that there doesn't seem to be a better option, but I bet we're going to have a lot of really unhappy users if we do this.

What about the bug we fixed in https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=2ce439f3379aed857517c8ce207485655000fc8e? Say somebody does something along the lines of:

    ps uxww | grep postgres | grep -v grep | awk '{print $2}' | xargs kill -9

...and then restarts postgres. Craig's proposal wouldn't cover this case, because there was no opportunity to run fsync() after the first crash, and there's now no way to go back and fsync() any stuff we didn't fsync() before, because the kernel may have already thrown away the error state, or may lie to us and tell us everything is fine (because our new fd wasn't opened early enough). I can't find the original discussion that led to that commit right now, so I'm not exactly sure what scenarios we were thinking about.
But I think it would at least be a problem if full_page_writes=off or if you had previously started the server with fsync=off and now wish to switch to fsync=on after completing a bulk load or similar. Recovery can read a page, see that it looks OK, and continue, and then a later fsync() failure can revert that page to an earlier state and now your database is corrupted -- and there's absolute no way to detect this because write() gives you the new page contents later, fsync() doesn't feel obliged to tell you about the error because your fd wasn't opened early enough, and eventually the write can be discarded and you'll revert back to the old page version with no errors ever being reported anywhere.

Another consequence of this behavior that initdb -S is never reliable, so pg_rewind's use of it doesn't actually fix the problem it was intended to solve. It also means that initdb itself isn't crash-safe, since the data file changes are made by the backend but initdb itself is doing the fsyncs, and initdb has no way of knowing what files the backend is going to create and therefore can't -- even theoretically -- open them first.

What's being presented to us as the API contract that we should expect from buffered I/O is that if you open a file and read() from it, call fsync(), and get no error, the kernel may nevertheless decide that some previous write that it never managed to flush can't be flushed, and then revert the page to the contents it had at some point in the past. That's mostly or less equivalent to letting a malicious adversary randomly overwrite database pages plausible-looking but incorrect contents without notice and hoping you can still build a reliable system.
You can avoid the problem if you can always open an fd for every file you want to modify before it's written and hold on to it until after it's fsync'd, but that's pretty hard to guarantee in the face of kill -9.

I think the simplest technological solution to this problem is to rewrite the entire backend and all supporting processes to use O_DIRECT everywhere. To maintain adequate performance, we'll have to write a complete I/O scheduling system inside PostgreSQL. Also, since we'll now have to make shared_buffers much larger -- since we'll no longer be benefiting from the OS cache -- we'll need to replace the use of malloc() with an allocator that pulls from shared_buffers. Plus, as noted, we'll need to totally rearchitect several of our critical frontend tools. Let's freeze all other development for the next year while we work on that, and put out a notice that Linux is no longer a supported platform for any existing release. Before we do that, we might want to check whether fsync() actually writes the data to disk in a usable way even with O_DIRECT. If not, we should just de-support Linux entirely as a hopelessly broken and unsupportable platform.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-09 19:13:14

Hi,

On 2018-04-09 15:02:11 -0400, Robert Haas wrote:
> I think the simplest technological solution to this problem is to rewrite the entire backend and all supporting processes to use O_DIRECT everywhere. To maintain adequate performance, we'll have to write a complete I/O scheduling system inside PostgreSQL. Also, since we'll now have to make shared_buffers much larger -- since we'll no longer be benefiting from the OS cache -- we'll need to replace the use of malloc() with an allocator that pulls from shared_buffers. Plus, as noted, we'll need to totally rearchitect several of our critical frontend tools. Let's freeze all other development for the next year while we work on that, and put out a notice that Linux is no longer a supported platform for any existing release.
> Before we do that, we might want to check whether fsync() actually writes the data to disk in a usable way even with O_DIRECT. If not, we should just de-support Linux entirely as a hopelessly broken and unsupportable platform.

Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.

We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Date: 2018-04-09 19:22:58

On 04/09/2018 08:29 PM, Mark Dilger wrote:
> On Apr 9, 2018, at 10:26 AM, Joshua D. Drake wrote:
> > We have plenty of YEARS of people not noticing this issue
>
> I disagree. I have noticed this problem, but blamed it on other things. For over five years now, I have had to tell customers not to use thin provisioning, and I have had to add code to postgres to refuse to perform inserts or updates if the disk volume is more than 80% full. I have lost count of the number of customers who are running an older version of the product (because they refuse to upgrade) and come back with complaints that they ran out of disk and now their database is corrupt. All this time, I have been blaming this on virtualization and thin provisioning.

Yeah. There's a big difference between not noticing an issue because it does not happen very often vs. attributing it to something else.
If we had the ability to revisit past data corruption cases, we would probably discover a fair number of cases caused by this.

The other thing we probably need to acknowledge is that the environment changes significantly - things like thin provisioning are likely to get even more common, increasing the incidence of these issues.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
Date: 2018-04-09 19:25:33

On Mon, Apr 9, 2018 at 12:13 PM, Andres Freund wrote:
> Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.

+1

> We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.

Right. We seem to be implicitly assuming that there is a big difference between a problem in the storage layer that we could in principle detect, but don't, and any other problem in the storage layer. I've read articles claiming that technologies like SMART are not really reliable in a practical sense [1], so it seems to me that there is reason to doubt that this gap is all that big.

That said, I suspect that the problems with running out of disk space are serious practical problems. I have personally scoffed at stories involving Postgres databases corruption that gets attributed to running out of disk space. Looks like I was dead wrong.

[1] https://danluu.com/file-consistency/ -- "Filesystem correctness"

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-09 19:26:21

On Mon, Apr 09, 2018 at 04:29:36PM +0100, Greg Stark wrote:
> Honestly I don't think there's any way to use the current interface to implement reliable operation.
> Even that embedded database using a single process and keeping every file open all the time (which means file descriptor limits limit its scalability) can be having silent corruption whenever some other process like a backup program comes along and calls fsync (or even sync?).

That is indeed true (sync would induce fsync on open inodes and clear the error), and that's a nasty bug that apparently went unnoticed for a very long time. Hopefully the errseq_t linux 4.13 fixes deal with at least this issue, but similar fixes need to be adopted by many other kernels (all those that mark failed pages as clean).

I honestly do not expect that keeping around the failed pages will be an acceptable change for most kernels, and as such the recommendation will probably be to coordinate in userspace for the fsync().

What about having buffered IO with implied fsync() atomicity via O_SYNC? This would probably necessitate some helper threads that mask the latency and present an async interface to the rest of PG, but sounds less intrusive than going for DIO.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-09 19:29:16

On 2018-04-09 21:26:21 +0200, Anthony Iliopoulos wrote:
> What about having buffered IO with implied fsync() atomicity via O_SYNC?

You're kidding, right? We could also just add sleep(30)'s all over the tree, and hope that that'll solve the problem. There's a reason we don't permanently fsync everything. Namely that it'll be way too slow.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-09 19:37:03

On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos wrote:
> I honestly do not expect that keeping around the failed pages will be an acceptable change for most kernels, and as such the recommendation will probably be to coordinate in userspace for the fsync().

Why is that required? You could very well just keep per inode information about fatal failures that occurred around. Report errors until that bit is explicitly cleared.
Yes, that keeps some memory around until unmount if nobody clears it. But it's orders of magnitude less, and results in usable semantics.

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Date: 2018-04-09 19:41:19

On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote:

You could make the argument that it's OK to forget if the entire file system goes away. But actually, why is that ok?

I was going to say that it'd be okay to clear error flag on umount, since any opened files would prevent unmounting; but, then I realized we need to consider the case of close()ing all FDs then opening them later..in another process.

I was going to say that's fine for postgres, since it chdir()s into its basedir, but actually not fine for nondefault tablespaces..

On Mon, Apr 09, 2018 at 02:54:16PM +0200, Anthony Iliopoulos wrote:

notification descriptor open, where the kernel would inject events related to writeback failures of files under watch (potentially enriched to contain info regarding the exact failed pages and the file offset they map to).

For postgres that'd require backend processes to open() a file such that, following its close(), any writeback errors are "signalled" to the checkpointer process...

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-09 19:44:31

On Mon, Apr 09, 2018 at 12:29:16PM -0700, Andres Freund wrote:
On 2018-04-09 21:26:21 +0200, Anthony Iliopoulos wrote:

What about having buffered IO with implied fsync() atomicity via O_SYNC?

You're kidding, right? We could also just add sleep(30)'s all over the tree, and hope that that'll solve the problem. There's a reason we don't permanently fsync everything. Namely that it'll be way too slow.

I am assuming you can apply the same principle of selectively using O_SYNC at times and places that you'd currently actually call fsync().

Also assuming that you'd want to have a backwards-compatible solution for all those kernels that don't keep the pages around, irrespective of future fixes.
Short of loading a kernel module and dealing with the problem directly, the only other available options seem to be either O_SYNC, O_DIRECT or ignoring the issue.

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Date: 2018-04-09 19:47:44

On 04/09/2018 04:22 PM, Anthony Iliopoulos wrote:
On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote:

We already have dirty_bytes and dirty_background_bytes, for example. I don't see why there couldn't be another limit defining how much dirty data to allow before blocking writes altogether. I'm sure it's not that simple, but you get the general idea - do not allow using all available memory because of writeback issues, but don't throw the data away in case it's just a temporary issue.

Sure, there could be knobs for limiting how much memory such "zombie" pages may occupy. Not sure how helpful it would be in the long run since this tends to be highly application-specific, and for something with a large data footprint one would end up tuning this accordingly in a system-wide manner. This has the potential to leave other applications running in the same system with very little memory, in cases where for example original application crashes and never clears the error. Apart from that, further interfaces would need to be provided for actually dealing with the error (again assuming non-transient issues that may not be fixed transparently and that temporary issues are taken care of by lower layers of the stack).

I don't quite see how this is any different from other possible issues when running multiple applications on the same system. One application can generate a lot of dirty data, reaching dirty_bytes and forcing the other applications on the same host to do synchronous writes.

Of course, you might argue that is a temporary condition - it will resolve itself once the dirty pages get written to storage.
In case of an I/O issue, it is a permanent impact - it will not resolve itself unless the I/O problem gets fixed.

Not sure what interfaces would need to be written? Possibly something that says "drop dirty pages for these files" after the application gets killed or something. That makes sense, of course.

Well, there seem to be kernels that seem to do exactly that already. At least that's how I understand what this thread says about FreeBSD and Illumos, for example. So it's not an entirely insane design, apparently.

It is reasonable, but even FreeBSD has a big fat comment right there (since 2017), mentioning that there can be no recovery from EIO at the block layer and this needs to be done differently. No idea how an application running on top of either FreeBSD or Illumos would actually recover from this error (and clear it out), other than remounting the fs in order to force dropping of relevant pages. It does provide though indeed a persistent error indication that would allow Pg to simply reliably panic. But again this does not necessarily play well with other applications that may be using the filesystem reliably at the same time, and are now faced with EIO while their own writes succeed to be persisted.

In my experience when you have a persistent I/O error on a device, it likely affects all applications using that device. So unmounting the fs to clear the dirty pages seems like an acceptable solution to me.

I don't see what else the application should do? In a way I'm suggesting applications don't really want to be responsible for recovering (cleanup of dirty pages etc.). We're more than happy to hand that over to kernel, e.g. because each kernel will do that differently. What we however do want is reliable information about fsync outcome, which we need to properly manage WAL, checkpoints etc.

Ideally, you'd want a (potentially persistent) indication of error localized to a file region (mapping the corresponding failed writeback pages).
NetBSD is already implementing fsync_range(), which could be a step in the right direction.

One has to wonder how many applications actually use this correctly, considering PostgreSQL cares about data durability/consistency so much and yet we've been misunderstanding how it works for 20+ years.

I would expect it would be very few, potentially those that have a very simple process model (e.g. embedded DBs that can abort a txn on fsync() EIO). I think that durability is a rather complex cross-layer issue which has been grossly misunderstood similarly in the past (e.g. see [1]). It seems that both the OS and DB communities greatly benefit from a periodic reality check, and I see this as an opportunity for strengthening the IO stack in an end-to-end manner.

Right. What I was getting to is that perhaps the current fsync() behavior is not very practical for building actual applications.

Best regards,
Anthony

[1] https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf

Thanks. The paper looks interesting.

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-09 19:51:12

On Mon, Apr 09, 2018 at 12:37:03PM -0700, Andres Freund wrote:
On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos wrote:

I honestly do not expect that keeping around the failed pages will be an acceptable change for most kernels, and as such the recommendation will probably be to coordinate in userspace for the fsync().

Why is that required? You could very well just keep per inode information about fatal failures that occurred around. Report errors until that bit is explicitly cleared. Yes, that keeps some memory around until unmount if nobody clears it.
But it's orders of magnitude less, and results in usable semantics.

As discussed before, I think this could be acceptable, especially if you pair it with an opt-in mechanism (only applications that care to deal with this will have to), and would give it a shot.

Still need a way to deal with all other systems and prior kernel releases that are eating fsync() writeback errors even over sync().

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Date: 2018-04-09 19:54:05

On 04/09/2018 09:37 PM, Andres Freund wrote:
On April 9, 2018 12:26:21 PM PDT, Anthony Iliopoulos wrote:

I honestly do not expect that keeping around the failed pages will be an acceptable change for most kernels, and as such the recommendation will probably be to coordinate in userspace for the fsync().

Why is that required? You could very well just keep per inode information about fatal failures that occurred around. Report errors until that bit is explicitly cleared. Yes, that keeps some memory around until unmount if nobody clears it. But it's orders of magnitude less, and results in usable semantics.

Isn't the expectation that when a fsync call fails, the next one will retry writing the pages in the hope that it succeeds?

Of course, it's also possible to do what you suggested, and simply mark the inode as failed. In which case the next fsync can't possibly retry the writes (e.g. after freeing some space on thin-provisioned system), but we'd get reliable failure mode.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-09 19:59:34

On 2018-04-09 14:41:19 -0500, Justin Pryzby wrote:
On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote:

You could make the argument that it's OK to forget if the entire file system goes away.
But actually, why is that ok?

I was going to say that it'd be okay to clear error flag on umount, since any opened files would prevent unmounting; but, then I realized we need to consider the case of close()ing all FDs then opening them later..in another process.

On Mon, Apr 09, 2018 at 02:54:16PM +0200, Anthony Iliopoulos wrote:

notification descriptor open, where the kernel would inject events related to writeback failures of files under watch (potentially enriched to contain info regarding the exact failed pages and the file offset they map to).

For postgres that'd require backend processes to open() a file such that, following its close(), any writeback errors are "signalled" to the checkpointer process...

I don't think that's as hard as some people argued in this thread. We could very well open a pipe in postmaster with the write end open in each subprocess, and the read end open only in checkpointer (and postmaster, but unused there). Whenever closing a file descriptor that was dirtied in the current process, send it over the pipe to the checkpointer. The checkpointer then can receive all those file descriptors (making sure it's not above the limit, fsync(), close()ing to make room if necessary). The biggest complication would presumably be to deduplicate the received file descriptors for the same file, without losing track of any errors.

Even better, we could do so via a dedicated worker. That'd quite possibly end up as a performance benefit.

I was going to say that's fine for postgres, since it chdir()s into its basedir, but actually not fine for nondefault tablespaces..

I think it'd be fair to open PG_VERSION of all created tablespaces. Would require some hangups to signal checkpointer (or whichever process) to do so when creating one, but it shouldn't be too hard.
Some people would complain because they can't do some nasty hacks anymore, but it'd also save people's butts by preventing them from accidentally unmounting.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-09 20:04:20

Hi,

On 2018-04-09 21:54:05 +0200, Tomas Vondra wrote:

Isn't the expectation that when a fsync call fails, the next one will retry writing the pages in the hope that it succeeds?

Some people expect that, I personally don't think it's a useful expectation.

We should just deal with this by crash-recovery. The big problem I see is that you always need to keep a file descriptor open for pretty much any file written to inside and outside of postgres, to be guaranteed to see errors. And that'd solve that. Even if retrying would work, I'd advocate for that (I've done so in the past, and I've written code in pg that panics on fsync failure...).

What we'd need to do however is to clear that bit during crash recovery... Which is interesting from a policy perspective. Could be that other apps wouldn't want that.

I also wonder if we couldn't just somewhere read each relevant mounted filesystem's errseq value. Whenever checkpointer notices before finishing a checkpoint that it has changed, do a crash restart.

From: Mark Dilger <hornschnorter(at)gmail(dot)com>
Date: 2018-04-09 20:25:54

On Apr 9, 2018, at 12:13 PM, Andres Freund wrote:

Hi,

On 2018-04-09 15:02:11 -0400, Robert Haas wrote:

I think the simplest technological solution to this problem is to rewrite the entire backend and all supporting processes to use O_DIRECT everywhere. To maintain adequate performance, we'll have to write a complete I/O scheduling system inside PostgreSQL. Also, since we'll now have to make shared_buffers much larger -- since we'll no longer be benefiting from the OS cache -- we'll need to replace the use of malloc() with an allocator that pulls from shared_buffers. Plus, as noted, we'll need to totally rearchitect several of our critical frontend tools.
Let's freeze all other development for the next year while we work on that, and put out a notice that Linux is no longer a supported platform for any existing release. Before we do that, we might want to check whether fsync() actually writes the data to disk in a usable way even with O_DIRECT. If not, we should just de-support Linux entirely as a hopelessly broken and unsupportable platform.

Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.

We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.

I was reading this thread up until now as meaning that the standby could receive corrupt WAL data and become corrupted. That seems a much bigger problem than merely having the master become corrupted in some unrecoverable way. It is a long standing expectation that serious hardware problems on the master can result in the master needing to be replaced. But there has not been an expectation that the one or more standby servers would be taken down along with the master, leaving all copies of the database unusable. If this bug corrupts the standby servers, too, then it is a whole different class of problem than the one folks have come to expect.

Your comment reads as if this is a problem isolated to whichever server has the problem, and will not get propagated to other servers.
Am I reading that right?

Can anybody clarify this for non-core-hacker folks following along at home?

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Date: 2018-04-09 20:30:00

On 04/09/2018 10:04 PM, Andres Freund wrote:

Hi,

On 2018-04-09 21:54:05 +0200, Tomas Vondra wrote:

Isn't the expectation that when a fsync call fails, the next one will retry writing the pages in the hope that it succeeds?

Some people expect that, I personally don't think it's a useful expectation.

Maybe. I'd certainly prefer automated recovery from temporary I/O issues (like full disk on thin-provisioning) without the database crashing and restarting. But I'm not sure it's worth the effort.

And most importantly, it's rather delusional to think the kernel developers are going to be enthusiastic about that approach ...

We should just deal with this by crash-recovery. The big problem I see is that you always need to keep a file descriptor open for pretty much any file written to inside and outside of postgres, to be guaranteed to see errors. And that'd solve that. Even if retrying would work, I'd advocate for that (I've done so in the past, and I've written code in pg that panics on fsync failure...).

Sure. And it's likely way less invasive from kernel perspective.

What we'd need to do however is to clear that bit during crash recovery... Which is interesting from a policy perspective. Could be that other apps wouldn't want that.

IMHO it'd be enough if a remount clears it.

I also wonder if we couldn't just somewhere read each relevant mounted filesystem's errseq value. Whenever checkpointer notices before finishing a checkpoint that it has changed, do a crash restart.

Hmmmm, that's an interesting idea, and it's about the only thing that would help us on older kernels. There's a wb_err in address_space, but that's at inode level.
Not sure if there's something at fs level.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-09 20:34:15

Hi,

On 2018-04-09 13:25:54 -0700, Mark Dilger wrote:

I was reading this thread up until now as meaning that the standby could receive corrupt WAL data and become corrupted.

I don't see that as a real problem here. For one the problematic scenarios shouldn't readily apply, for another WAL is checksummed.

There's the problem that a new basebackup would potentially become corrupted however. And similarly pg_rewind.

Note that I'm not saying that we and/or linux shouldn't change anything. Just that the apocalypse isn't here.

Your comment reads as if this is a problem isolated to whichever server has the problem, and will not get propagated to other servers. Am I reading that right?

I think that's basically right. There's cases where corruption could get propagated, but they're not straightforward.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-09 20:37:31

Hi,

On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote:

Maybe. I'd certainly prefer automated recovery from temporary I/O issues (like full disk on thin-provisioning) without the database crashing and restarting. But I'm not sure it's worth the effort.

Oh, I agree on that one. But that's more a question of how we force the kernel's hand on allocating disk space. In most cases the kernel allocates the disk space immediately, even if delayed allocation is in effect. For the cases where that's not the case (if there are current ones, rather than just past bugs), we should be able to make sure that's not an issue by pre-zeroing the data and/or using fallocate.

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Date: 2018-04-09 20:43:03

On 04/09/2018 10:25 PM, Mark Dilger wrote:
On Apr 9, 2018, at 12:13 PM, Andres Freund wrote:

Hi,

On 2018-04-09 15:02:11 -0400, Robert Haas wrote:

I think the simplest technological solution to this problem is to rewrite the entire backend and all supporting processes to use O_DIRECT everywhere.
To maintain adequate performance, we'll have to write a complete I/O scheduling system inside PostgreSQL. Also, since we'll now have to make shared_buffers much larger -- since we'll no longer be benefiting from the OS cache -- we'll need to replace the use of malloc() with an allocator that pulls from shared_buffers. Plus, as noted, we'll need to totally rearchitect several of our critical frontend tools. Let's freeze all other development for the next year while we work on that, and put out a notice that Linux is no longer a supported platform for any existing release. Before we do that, we might want to check whether fsync() actually writes the data to disk in a usable way even with O_DIRECT. If not, we should just de-support Linux entirely as a hopelessly broken and unsupportable platform.

Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.

We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.

I was reading this thread up until now as meaning that the standby could receive corrupt WAL data and become corrupted. That seems a much bigger problem than merely having the master become corrupted in some unrecoverable way. It is a long standing expectation that serious hardware problems on the master can result in the master needing to be replaced.
But there has not been an expectation that the one or more standby servers would be taken down along with the master, leaving all copies of the database unusable. If this bug corrupts the standby servers, too, then it is a whole different class of problem than the one folks have come to expect.

Your comment reads as if this is a problem isolated to whichever server has the problem, and will not get propagated to other servers. Am I reading that right?

Can anybody clarify this for non-core-hacker folks following along at home?

That's a good question. I don't see any guarantee it'd be isolated to the master node. Consider this example:

(0) checkpoint happens on the primary
(1) a page gets modified, a full-page gets written to WAL
(2) the page is written out to page cache
(3) writeback of that page fails (and gets discarded)
(4) we attempt to modify the page again, but we read the stale version
(5) we modify the stale version, writing the change to WAL

The standby will get the full-page, and then a WAL from the stale page version. That doesn't seem like a story with a happy end, I guess. But I might be easily missing some protection built into the WAL ...

From: Mark Dilger <hornschnorter(at)gmail(dot)com>
Date: 2018-04-09 20:55:29

On Apr 9, 2018, at 1:43 PM, Tomas Vondra wrote:
On 04/09/2018 10:25 PM, Mark Dilger wrote:
On Apr 9, 2018, at 12:13 PM, Andres Freund wrote:

Hi,

On 2018-04-09 15:02:11 -0400, Robert Haas wrote:

I think the simplest technological solution to this problem is to rewrite the entire backend and all supporting processes to use O_DIRECT everywhere. To maintain adequate performance, we'll have to write a complete I/O scheduling system inside PostgreSQL. Also, since we'll now have to make shared_buffers much larger -- since we'll no longer be benefiting from the OS cache -- we'll need to replace the use of malloc() with an allocator that pulls from shared_buffers. Plus, as noted, we'll need to totally rearchitect several of our critical frontend tools.
Let's freeze all other development for the next year while we work on that, and put out a notice that Linux is no longer a supported platform for any existing release. Before we do that, we might want to check whether fsync() actually writes the data to disk in a usable way even with O_DIRECT. If not, we should just de-support Linux entirely as a hopelessly broken and unsupportable platform.

Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.

We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.

I was reading this thread up until now as meaning that the standby could receive corrupt WAL data and become corrupted. That seems a much bigger problem than merely having the master become corrupted in some unrecoverable way. It is a long standing expectation that serious hardware problems on the master can result in the master needing to be replaced. But there has not been an expectation that the one or more standby servers would be taken down along with the master, leaving all copies of the database unusable. If this bug corrupts the standby servers, too, then it is a whole different class of problem than the one folks have come to expect.

Your comment reads as if this is a problem isolated to whichever server has the problem, and will not get propagated to other servers. Am I reading that right?

Can anybody clarify this for non-core-hacker folks following along at home?

That's a good question. I don't see any guarantee it'd be isolated to the master node.
Consider this example:

(0) checkpoint happens on the primary
(1) a page gets modified, a full-page gets written to WAL
(2) the page is written out to page cache
(3) writeback of that page fails (and gets discarded)
(4) we attempt to modify the page again, but we read the stale version
(5) we modify the stale version, writing the change to WAL

The standby will get the full-page, and then a WAL from the stale page version. That doesn't seem like a story with a happy end, I guess. But I might be easily missing some protection built into the WAL ...

I can also imagine a master and standby that are similarly provisioned, and thus hit an out of disk error at around the same time, resulting in corruption on both, even if not the same corruption. When choosing to have one standby, or two standbys, or ten standbys, one needs to be able to assume a certain amount of statistical independence between failures on one server and failures on another. If they are tightly correlated dependent variables, then the conclusion that the probability of all nodes failing simultaneously is vanishingly small becomes invalid.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-09 21:08:29

Hi,

On 2018-04-09 13:55:29 -0700, Mark Dilger wrote:

I can also imagine a master and standby that are similarly provisioned, and thus hit an out of disk error at around the same time, resulting in corruption on both, even if not the same corruption.

I think it's a grave mistake conflating ENOSPC issues (which we should solve by making sure there's always enough space pre-allocated), with EIO type errors.
The problem is different, the solution is different.

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Date: 2018-04-09 21:25:52

On 04/09/2018 11:08 PM, Andres Freund wrote:

Hi,

On 2018-04-09 13:55:29 -0700, Mark Dilger wrote:

I can also imagine a master and standby that are similarly provisioned, and thus hit an out of disk error at around the same time, resulting in corruption on both, even if not the same corruption.

I think it's a grave mistake conflating ENOSPC issues (which we should solve by making sure there's always enough space pre-allocated), with EIO type errors. The problem is different, the solution is different.

In any case, that certainly does not count as data corruption spreading from the master to standby.

From: Mark Dilger <hornschnorter(at)gmail(dot)com>
Date: 2018-04-09 21:33:29

On Apr 9, 2018, at 2:25 PM, Tomas Vondra wrote:
On 04/09/2018 11:08 PM, Andres Freund wrote:

Hi,

On 2018-04-09 13:55:29 -0700, Mark Dilger wrote:

I can also imagine a master and standby that are similarly provisioned, and thus hit an out of disk error at around the same time, resulting in corruption on both, even if not the same corruption.

I think it's a grave mistake conflating ENOSPC issues (which we should solve by making sure there's always enough space pre-allocated), with EIO type errors. The problem is different, the solution is different.

I'm happy to take your word for that.

In any case, that certainly does not count as data corruption spreading from the master to standby.

Maybe not from the point of view of somebody looking at the code. But a user might see it differently. If the data being loaded into the master and getting replicated to the standby "causes" both to get corrupt, then it seems like corruption spreading.
I put "causes" in quotes because there is some argument to be made about "correlation does not prove cause" and so forth, but it still feels like causation from an arm's length perspective. If there is a pattern of standby servers tending to fail more often right around the time that the master fails, you'll have a hard time comforting users, "hey, it's not technically causation." If loading data into the master causes the master to hit ENOSPC, and replicating that data to the standby causes the standby to hit ENOSPC, and if the bug around ENOSPC has not been fixed, then this looks like corruption spreading.

I'm certainly planning on taking a hard look at the disk allocation on my standby servers right soon now.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-09 22:33:16

On Tue, Apr 10, 2018 at 2:22 AM, Anthony Iliopoulos wrote:
On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote:

Well, there seem to be kernels that seem to do exactly that already. At least that's how I understand what this thread says about FreeBSD and Illumos, for example. So it's not an entirely insane design, apparently.

It is reasonable, but even FreeBSD has a big fat comment right there (since 2017), mentioning that there can be no recovery from EIO at the block layer and this needs to be done differently. No idea how an application running on top of either FreeBSD or Illumos would actually recover from this error (and clear it out), other than remounting the fs in order to force dropping of relevant pages. It does provide though indeed a persistent error indication that would allow Pg to simply reliably panic. But again this does not necessarily play well with other applications that may be using the filesystem reliably at the same time, and are now faced with EIO while their own writes succeed to be persisted.

Right.
For anyone interested, here is the change you mentioned, and an interesting one that came a bit earlier last year:

https://reviews.freebsd.org/rS316941 -- drop buffers after device goes away
https://reviews.freebsd.org/rS326029 -- update comment about EIO contract

Retrying may well be futile, but at least future fsync() calls won't report success bogusly. There may of course be more space-efficient ways to represent that state as the comment implies, while never lying to the user -- perhaps involving filesystem level or (pinned) inode level errors that stop all writes until unmounted. Something tells me they won't resort to flakey fsync() error reporting.

I wonder if anyone can tell us what Windows, AIX and HPUX do here.

[1] https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf

Very interesting, thanks.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-10 00:32:20

On Tue, Apr 10, 2018 at 10:33 AM, Thomas Munro wrote:

I wonder if anyone can tell us what Windows, AIX and HPUX do here.

I created a wiki page to track what we know (or think we know) about fsync() on various operating systems:

https://wiki.postgresql.org/wiki/Fsync_Errors

If anyone has more information or sees mistakes, please go ahead and edit it.

From: Andreas Karlsson <andreas(at)proxel(dot)se>
Date: 2018-04-10 00:41:10

On 04/09/2018 02:16 PM, Craig Ringer wrote:

I'd like a middle ground where the kernel lets us register our interest and tells us if it lost something, without us having to keep eight million FDs open for some long period. "Tell us about anything that happens under pgdata/" or an inotify-style per-directory-registration option.
I'd even say that's ideal.

Could there be a risk of a race condition here where fsync incorrectly returns success before we get the notification that something went wrong?

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-10 01:44:59

On 10 April 2018 at 03:59, Andres Freund wrote:
On 2018-04-09 14:41:19 -0500, Justin Pryzby wrote:
On Mon, Apr 09, 2018 at 09:31:56AM +0800, Craig Ringer wrote:

You could make the argument that it's OK to forget if the entire file system goes away. But actually, why is that ok?

I was going to say that it'd be okay to clear error flag on umount, since any opened files would prevent unmounting; but, then I realized we need to consider the case of close()ing all FDs then opening them later..in another process.

On Mon, Apr 09, 2018 at 02:54:16PM +0200, Anthony Iliopoulos wrote:

notification descriptor open, where the kernel would inject events related to writeback failures of files under watch (potentially enriched to contain info regarding the exact failed pages and the file offset they map to).

For postgres that'd require backend processes to open() a file such that, following its close(), any writeback errors are "signalled" to the checkpointer process...

I don't think that's as hard as some people argued in this thread. We could very well open a pipe in postmaster with the write end open in each subprocess, and the read end open only in checkpointer (and postmaster, but unused there). Whenever closing a file descriptor that was dirtied in the current process, send it over the pipe to the checkpointer. The checkpointer then can receive all those file descriptors (making sure it's not above the limit, fsync(), close()ing to make room if necessary). The biggest complication would presumably be to deduplicate the received file descriptors for the same file, without losing track of any errors.

Yep. That'd be a cheaper way to do it, though it wouldn't work on Windows.
Though we don't know how Windows behaves here at all yet.

Prior discussion upthread had the checkpointer open()ing a file at the same time as a backend, before the backend writes to it. But passing the fd when the backend is done with it would be better.

We'd need a way to dup() the fd and pass it back to a backend when it needed to reopen it sometimes, or just make sure to keep the oldest copy of the fd when a backend reopens multiple times, but that's no biggie.

We'd still have to fsync() out early in the checkpointer if we ran out of space in our FD list, and initscripts would need to change our ulimit or we'd have to do it ourselves in the checkpointer. But neither seems insurmountable.

FWIW, I agree that this is a corner case, but it's getting to be a pretty big corner with the spread of overcommitted, deduplicating SANs, cloud storage, etc. Not all I/O errors indicate permanent hardware faults, disk failures, etc, as I outlined earlier. I'm very curious to know what AWS EBS's error semantics are, and other cloud network block stores. (I posted on Amazon forums https://forums.aws.amazon.com/thread.jspa?threadID=279274&tstart=0 but nothing so far).

I'm also not particularly inclined to trust that all file systems will always reliably reserve space without having some cases where they'll fail writeback on space exhaustion.

So we don't need to panic and freak out, but it's worth looking at the direction the storage world is moving in, and whether this will become a bigger issue over time.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-10 01:52:21

On Tue, Apr 10, 2018 at 1:44 PM, Craig Ringer wrote:
> On 10 April 2018 at 03:59, Andres Freund wrote:
>> I don't think that's as hard as some people argued in this thread. We could very well open a pipe in postmaster with the write end open in each subprocess, and the read end open only in checkpointer (and postmaster, but unused there).
>> Whenever closing a file descriptor that was dirtied in the current process, send it over the pipe to the checkpointer. The checkpointer then can receive all those file descriptors (making sure it's not above the limit, fsync(), close()ing to make room if necessary). The biggest complication would presumably be to deduplicate the received file descriptors for the same file, without losing track of any errors.
>
> Yep. That'd be a cheaper way to do it, though it wouldn't work on Windows. Though we don't know how Windows behaves here at all yet.
>
> Prior discussion upthread had the checkpointer open()ing a file at the same time as a backend, before the backend writes to it. But passing the fd when the backend is done with it would be better.

How would that interlock with concurrent checkpoints?

I can see how to make that work if the share-fd-or-fsync-now logic happens in smgrwrite() when called by FlushBuffer() while you hold io_in_progress, but not if you defer it to some random time later.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-10 01:54:30

On 10 April 2018 at 04:25, Mark Dilger wrote:
> I was reading this thread up until now as meaning that the standby could receive corrupt WAL data and become corrupted.

Yes, it can, but not directly through the first error.

What can happen is that we think a block got written when it didn't. If our in memory state diverges from our on disk state, we can make subsequent WAL writes based on that wrong information. But that's actually OK, since the standby will have replayed the original WAL correctly.

I think the only time we'd run into trouble is if we evict the good (but not written out) data from s_b and the fs buffer cache, then later read in the old version of a block we failed to overwrite. Data checksums (if enabled) might catch it unless the write left the whole block stale.
In that case we might generate a full page write with the stale block and propagate that over WAL to the standby.

So I'd say standbys are relatively safe - very safe if the issue is caught promptly, and less so over time. But AFAICS WAL-based replication (physical or logical) is not a perfect defense for this.

However, remember, if your storage system is free of any sort of overprovisioning, is on a non-network file system, and doesn't use multipath (or sets it up right) this issue is exceptionally unlikely to affect you.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-10 01:59:03

On 10 April 2018 at 04:37, Andres Freund wrote:
> Hi,
>
> On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote:
>> Maybe. I'd certainly prefer automated recovery from temporary I/O issues (like full disk on thin-provisioning) without the database crashing and restarting. But I'm not sure it's worth the effort.
>
> Oh, I agree on that one. But that's more a question of how we force the kernel's hand on allocating disk space. In most cases the kernel allocates the disk space immediately, even if delayed allocation is in effect. For the cases where that's not the case (if there are current ones, rather than just past bugs), we should be able to make sure that's not an issue by pre-zeroing the data and/or using fallocate.

Nitpick: In most cases the kernel reserves disk space immediately, before returning from write(). NFS seems to be the main exception here.

EXT4 and XFS don't allocate until later, it by performing actual writes to FS metadata, initializing disk blocks, etc. So we won't notice errors that are only detectable at actual time of allocation, like thin provisioning problems, until after write() returns and we face the same writeback issues.

So I reckon you're safe from space-related issues if you're not on NFS (and whyyy would you do that?) and not thinly provisioned.
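
Aside (an illustrative sketch, not part of the original message): the "pre-zeroing the data and/or using fallocate" suggestion quoted above could look roughly like the following in Python. The function name `preallocate`, the chunk size, and the fallback path are assumptions made for illustration; PostgreSQL is written in C and this is not its code. The sketch only shows space being claimed at allocation time so that a shortage is reported to the caller right away; whether that also covers thinly provisioned storage is not something this sketch can settle.

```python
import os


def preallocate(path: str, size: int, chunk: int = 8192) -> None:
    """Reserve `size` bytes for `path` before any real data is written."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if hasattr(os, "posix_fallocate"):
            # Ask the filesystem to allocate the blocks now; a shortage
            # surfaces here as OSError (ENOSPC) rather than at writeback.
            os.posix_fallocate(fd, 0, size)
        else:
            # Fallback: pre-zero the range so every block is written once.
            zeros = bytes(chunk)
            remaining = size
            while remaining > 0:
                written = os.write(fd, zeros[:min(chunk, remaining)])
                remaining -= written
        # Flush while the descriptor is still open, so any error from
        # writing the allocation out is reported to this caller.
        os.fsync(fd)
    finally:
        os.close(fd)
```

A caller would invoke something like `preallocate("segment.tmp", 16 * 1024 * 1024)` before writing real data and treat any `OSError` as a failure to create the file.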
I'm sure there are other corner cases, but I don't see any reason to expect space-exhaustion-related corruption problems on a sensible FS backed by a sensible block device. I haven't tested things like quotas, verified how reliable space reservation is under concurrency, etc as yet.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-10 02:00:59

On April 9, 2018 6:59:03 PM PDT, Craig Ringer wrote:
> On 10 April 2018 at 04:37, Andres Freund wrote:
>> Hi,
>>
>> On 2018-04-09 22:30:00 +0200, Tomas Vondra wrote:
>>> Maybe. I'd certainly prefer automated recovery from temporary I/O issues (like full disk on thin-provisioning) without the database crashing and restarting. But I'm not sure it's worth the effort.
>>
>> Oh, I agree on that one. But that's more a question of how we force the kernel's hand on allocating disk space. In most cases the kernel allocates the disk space immediately, even if delayed allocation is in effect. For the cases where that's not the case (if there are current ones, rather than just past bugs), we should be able to make sure that's not an issue by pre-zeroing the data and/or using fallocate.
>
> Nitpick: In most cases the kernel reserves disk space immediately, before returning from write(). NFS seems to be the main exception here.
>
> EXT4 and XFS don't allocate until later, it by performing actual writes to FS metadata, initializing disk blocks, etc. So we won't notice errors that are only detectable at actual time of allocation, like thin provisioning problems, until after write() returns and we face the same writeback issues.
>
> So I reckon you're safe from space-related issues if you're not on NFS (and whyyy would you do that?) and not thinly provisioned. I'm sure there are other corner cases, but I don't see any reason to expect space-exhaustion-related corruption problems on a sensible FS backed by a sensible block device.
> I haven't tested things like quotas, verified how reliable space reservation is under concurrency, etc as yet.

How's that not solved by pre zeroing and/or fallocate as I suggested above?

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-10 02:02:48

On 10 April 2018 at 08:41, Andreas Karlsson wrote:
> On 04/09/2018 02:16 PM, Craig Ringer wrote:
>> I'd like a middle ground where the kernel lets us register our interest and tells us if it lost something, without us having to keep eight million FDs open for some long period. "Tell us about anything that happens under pgdata/" or an inotify-style per-directory-registration option. I'd even say that's ideal.
>
> Could there be a risk of a race condition here where fsync incorrectly returns success before we get the notification that something went wrong?

We'd examine the notification queue only once all our checkpoint fsync()s had succeeded, and before we updated the control file to advance the redo position.

I'm intrigued by the suggestion upthread of using a kprobe or similar to achieve this. It's a horrifying unportable hack that'd make kernel people cry, and I don't know if we have any way to flush buffered probe data to be sure we really get the news in time, but it's a cool idea too.

From: Michael Paquier <michael(at)paquier(dot)xyz>
Date: 2018-04-10 05:04:13

On Mon, Apr 09, 2018 at 03:02:11PM -0400, Robert Haas wrote:
> Another consequence of this behavior is that initdb -S is never reliable, so pg_rewind's use of it doesn't actually fix the problem it was intended to solve. It also means that initdb itself isn't crash-safe, since the data file changes are made by the backend but initdb itself is doing the fsyncs, and initdb has no way of knowing what files the backend is going to create and therefore can't -- even theoretically -- open them first.

And pg_basebackup. And pg_dump. And pg_dumpall.
Anything using initdb -S or fsync_pgdata would enter in those waters.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-10 05:37:19

On 10 April 2018 at 13:04, Michael Paquier wrote:
> On Mon, Apr 09, 2018 at 03:02:11PM -0400, Robert Haas wrote:
>> Another consequence of this behavior is that initdb -S is never reliable, so pg_rewind's use of it doesn't actually fix the problem it was intended to solve. It also means that initdb itself isn't crash-safe, since the data file changes are made by the backend but initdb itself is doing the fsyncs, and initdb has no way of knowing what files the backend is going to create and therefore can't -- even theoretically -- open them first.
>
> And pg_basebackup. And pg_dump. And pg_dumpall. Anything using initdb -S or fsync_pgdata would enter in those waters.

... but only if they hit an I/O error or they're on a FS that doesn't reserve space and hit ENOSPC.

It still does 99% of the job. It still flushes all buffers to persistent storage and maintains write ordering. It may not detect and report failures to the user how we'd expect it to, yes, and that's not great. But it's hardly throw up our hands and give up territory either. Also, at least for initdb, we can make initdb fsync() its own files before close(). Annoying but hardly the end of the world.

From: Michael Paquier <michael(at)paquier(dot)xyz>
Date: 2018-04-10 06:10:21

On Tue, Apr 10, 2018 at 01:37:19PM +0800, Craig Ringer wrote:
> On 10 April 2018 at 13:04, Michael Paquier wrote:
>> And pg_basebackup. And pg_dump. And pg_dumpall. Anything using initdb -S or fsync_pgdata would enter in those waters.
>
> ... but only if they hit an I/O error or they're on a FS that doesn't reserve space and hit ENOSPC.

Sure.

> It still does 99% of the job. It still flushes all buffers to persistent storage and maintains write ordering. It may not detect and report failures to the user how we'd expect it to, yes, and that's not great. But it's hardly throw up our hands and give up territory either.
> Also, at least for initdb, we can make initdb fsync() its own files before close(). Annoying but hardly the end of the world.

Well, I think that there is place for improving reporting of failure in file_utils.c for frontends, or at worst have an exit() for any kind of critical failures equivalent to a PANIC.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-10 12:15:15

On 10 April 2018 at 14:10, Michael Paquier wrote:
> Well, I think that there is place for improving reporting of failure in file_utils.c for frontends, or at worst have an exit() for any kind of critical failures equivalent to a PANIC.

Yup.

In the mean time, speaking of PANIC, here's the first cut patch to make Pg panic on fsync() failures. I need to do some closer review and testing, but it's presented here for anyone interested.

I intentionally left some failures as ERROR not PANIC, where the entire operation is done as a unit, and an ERROR will cause us to retry the whole thing.

For example, when we fsync() a temp file before we move it into place, there's no point panicking on failure, because we'll discard the temp file on ERROR and retry the whole thing.

I've verified that it works as expected with some modifications to the test tool I've been using (pushed).

The main downside is that if we panic in redo, we don't try again. We throw our toys and shut down. But arguably if we get the same I/O error again in redo, that's the right thing to do anyway, and quite likely safer than continuing to ERROR on checkpoints indefinitely.

Patch attached.

To be clear, this patch only deals with the issue of us retrying fsyncs when it turns out to be unsafe. This does NOT address any of the issues where we won't find out about writeback errors at all.

Attachment: v1-0001-PANIC-when-we-detect-a-possible-fsync-I-O-error-i.patch (Content-Type: text/x-patch, Size: 10.3 KB)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
Date: 2018-04-10 15:15:46

On Mon, Apr 9, 2018 at 3:13 PM, Andres Freund wrote:
> Let's lower the pitchforks a bit here.
> Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.

Well, I admit that I wasn't entirely serious about that email, but I wasn't entirely not-serious either. If you can't reliably find out whether the contents of the file on disk are the same as the contents that the kernel is giving you when you call read(), then you are going to have a heck of a time building a reliable system. If the kernel developers are determined to insist on these semantics (and, admittedly, I don't know whether that's the case - I've only read Anthony's remarks), then I don't really see what we can do except give up on buffered I/O (or on Linux).

> We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.

I think that reliable error reporting is more than "nice" -- I think it's essential. The only argument for the current Linux behavior that has been so far advanced on this thread, at least as far as I can see, is that if it kept retrying the buffers forever, it would be pointless and might run the machine out of memory, so we might as well discard them.
But previous comments have already illustrated that the kernel is not really up against a wall there -- it could put individual inodes into a permanent failure state when it discards their dirty data, as you suggested, or it could do what others have suggested, and what I think is better, which is to put the whole filesystem into a permanent failure state that can be cleared by remounting the FS. That could be done on an as-needed basis -- if the number of dirty buffers you're holding onto for some filesystem becomes too large, put the filesystem into infinite-fail mode and discard them all. That behavior would be pretty easy for administrators to understand and would resolve the entire problem here provided that no PostgreSQL processes survived the eventual remount.

I also don't really know what we mean by an "unresolvable" error. If the drive is beyond all hope, then it doesn't really make sense to talk about whether the database stored on it is corrupt. In general we can't be sure that we'll even get an error - e.g. the system could be idle and the drive could be on fire. Maybe this is the case you meant by "it'd be nice if we could report it reliably". But at least in my experience, that's typically not what's going on. You get some I/O errors and so you remount the filesystem, or reboot, or rebuild the array, or ... something. And then the errors go away and, at that point, you want to run recovery and continue using your database. In this scenario, it matters quite a bit what the error reporting was like during the period when failures were occurring. In particular, if the database was allowed to think that it had successfully checkpointed when it didn't, you're going to start recovery from the wrong place.

I'm going to shut up now because I'm telling you things that you obviously already know, but this doesn't sound like a "near irresolvable corner case".
When the storage goes bonkers, either PostgreSQL and the kernel can interact in such a way that a checkpoint can succeed without all of the relevant data getting persisted, or they don't. It sounds like right now they do, and I'm not really clear that we have a reasonable idea how to fix that. It does not sound like a PANIC is sufficient.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
Date: 2018-04-10 15:28:07

On Tue, Apr 10, 2018 at 1:37 AM, Craig Ringer wrote:
> ... but only if they hit an I/O error or they're on a FS that doesn't reserve space and hit ENOSPC.
>
> It still does 99% of the job. It still flushes all buffers to persistent storage and maintains write ordering. It may not detect and report failures to the user how we'd expect it to, yes, and that's not great. But it's hardly throw up our hands and give up territory either. Also, at least for initdb, we can make initdb fsync() its own files before close(). Annoying but hardly the end of the world.

I think we'd need every child postgres process started by initdb to do that individually, which I suspect would slow down initdb quite a lot. Now admittedly for anybody other than a PostgreSQL developer that's only a minor issue, and our regression tests mostly run with fsync=off anyway. But I have a strong suspicion that our assumptions about how fsync() reports errors are baked into an awful lot of parts of the system, and by the time we get unbaking them I think it's going to be really surprising if we haven't done real harm to overall system performance.

BTW, I took a look at the MariaDB source code to see whether they've got this problem too and it sure looks like they do. os_file_fsync_posix() retries the fsync in a loop with an 0.2 second sleep after each retry. It warns after 100 failures and fails an assertion after 1000 failures. It is hard to understand why they would have written the code this way unless they expect errors reported by fsync() to continue being reported until the underlying condition is corrected.
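
Aside (an illustrative sketch, not part of the original message): the difference between a retry loop and failing on the first error can be shown with a small Python model. Everything here is hypothetical: `make_flaky_sync` is a stand-in that raises EIO once and then succeeds, imitating the report-once behavior discussed in this thread, and `retry_until_ok` only imitates the shape of a retry loop (the sleep and the warning thresholds described above are omitted). It is not MariaDB's or PostgreSQL's code.

```python
import errno
import os


class DurabilityError(RuntimeError):
    """A flush failed, so cached writes can no longer be trusted."""


def retry_until_ok(fd, sync=os.fsync, attempts=3):
    """Retry-style loop: any later success is taken as success."""
    for _ in range(attempts):
        try:
            sync(fd)
            return True
        except OSError:
            continue
    return False


def flush_or_fail(fd, sync=os.fsync):
    """Treat the first failure as final instead of asking again."""
    try:
        sync(fd)
    except OSError as exc:
        raise DurabilityError(f"fsync failed: {exc}") from exc


def make_flaky_sync():
    """Hypothetical stand-in for a kernel that reports an error only once."""
    state = {"reported": False}

    def sync(fd):
        if not state["reported"]:
            state["reported"] = True
            raise OSError(errno.EIO, os.strerror(errno.EIO))
        # Later calls: the error was already consumed, so nothing is raised
        # even though the data from the failed write never reached storage.

    return sync
```

With the stand-in, `retry_until_ok(0, sync=make_flaky_sync())` returns `True` on its second attempt, while `flush_or_fail(0, sync=make_flaky_sync())` raises `DurabilityError`, which is the outcome a PANIC-on-failure policy aims for.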
But, it looks like they wouldn't have the problem that we do with trying to reopen files to fsync() them later -- I spot checked a few places where this code is invoked and in all of those it looks like the file is already expected to be open.

From: Anthony Iliopoulos <ailiop(at)altatus(dot)com>
Date: 2018-04-10 15:40:05

Hi Robert,

On Tue, Apr 10, 2018 at 11:15:46AM -0400, Robert Haas wrote:
> On Mon, Apr 9, 2018 at 3:13 PM, Andres Freund wrote:
>> Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.
>
> Well, I admit that I wasn't entirely serious about that email, but I wasn't entirely not-serious either. If you can't reliably find out whether the contents of the file on disk are the same as the contents that the kernel is giving you when you call read(), then you are going to have a heck of a time building a reliable system. If the kernel developers are determined to insist on these semantics (and, admittedly, I don't know whether that's the case - I've only read Anthony's remarks), then I don't really see what we can do except give up on buffered I/O (or on Linux).

I think it would be interesting to get in touch with some of the respective linux kernel maintainers and open up this topic for more detailed discussions.
LSF/MM'18 is upcoming and it would have been the perfect opportunity but it's past the CFP deadline. It may still be worth contacting the organizers to bring forward the issue, and see if there is a chance to have someone from Pg invited for further discussions.

From: Greg Stark <stark(at)mit(dot)edu>
Date: 2018-04-10 16:38:27

On 9 April 2018 at 11:50, Anthony Iliopoulos wrote:
> On Mon, Apr 09, 2018 at 09:45:40AM +0100, Greg Stark wrote:
>> On 8 April 2018 at 22:47, Anthony Iliopoulos wrote:
>>> To make things a bit simpler, let us focus on EIO for the moment. The contract between the block layer and the filesystem layer is assumed to be that of, when an EIO is propagated up to the fs, then you may assume that all possibilities for recovering have been exhausted in lower layers of the stack.
>>
>> Well Postgres is using the filesystem. The interface between the block layer and the filesystem may indeed need to be more complex, I wouldn't know.
>>
>> But I don't think "all possibilities" is a very useful concept. Neither layer here is going to be perfect. They can only promise that all possibilities that have actually been implemented have been exhausted. And even among those only to the degree they can be done automatically within the engineering tradeoffs and constraints. There will always be cases like thin provisioned devices that an operator can expand, or degraded raid arrays that can be repaired after a long operation and so on. A network device can't be sure whether a remote server may eventually come back or not and have to be reconfigured by a human or system automation tool to point to the new server or new network configuration.
>
> Right. This implies though that apart from the kernel having to keep around the dirtied-but-unrecoverable pages for an unbounded time, that there's further an interface for obtaining the exact failed pages so that you can read them back.

No, the interface we have is fsync which gives us that information with the granularity of a single file.
The database could in theory recognize that fsync is not completing on a file and read that file back and write it to a new file. More likely we would implement a feature Oracle has of writing key files to multiple devices. But currently in practice that's not what would happen, what would happen would be a human would recognize that the database has stopped being able to commit and there are hardware errors in the log and would stop the database, take a backup, and restore onto a new working device.

The current interface is that there's one error and then Postgres would pretty much have to say, "sorry, your database is corrupt and the data is gone, restore from your backups". Which is pretty dismal.

> There is a clear responsibility of the application to keep its buffers around until a successful fsync(). The kernels do report the error (albeit with all the complexities of dealing with the interface), at which point the application may not assume that the write()s were ever even buffered in the kernel page cache in the first place.

Postgres cannot just store the entire database in RAM. It writes things to the filesystem all the time. It calls fsync only when it needs a write barrier to ensure consistency. That's only frequent on the transaction log to ensure it's flushed before data modifications and then periodically to checkpoint the data files. The amount of data written between checkpoints can be arbitrarily large and Postgres has no idea how much memory is available as filesystem buffers or how much i/o bandwidth is available or other memory pressure there is. What you're suggesting is that the application should have to babysit the filesystem buffer cache and reimplement all of it in user-space because the filesystem is free to throw away any data any time it chooses?

The current interface to throw away filesystem buffer cache is unmount.
It sounds like the kernel would like a more granular way to discard just part of a device which makes a lot of sense in the age of large network block devices. But I don't think just saying that the filesystem buffer cache is now something every application needs to re-implement in user-space really helps with that, they're going to have the same problems to solve.

From: Greg Stark <stark(at)mit(dot)edu>
Date: 2018-04-10 16:54:40

On 10 April 2018 at 02:59, Craig Ringer wrote:
> Nitpick: In most cases the kernel reserves disk space immediately, before returning from write(). NFS seems to be the main exception here.

I'm kind of puzzled by this. Surely NFS servers store the data in the filesystem using write(2) or the in-kernel equivalent? So if the server is backed by a filesystem where write(2) preallocates space surely the NFS server must behave as if it's preallocating as well? I would expect NFS to provide basically the same set of possible failures as the underlying filesystem (as long as you don't enable nosync of course).

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Date: 2018-04-10 18:58:37

-hackers,

I reached out to the Linux ext4 devs, here is tytso(at)mit(dot)edu response:

"""
Hi Joshua,

This isn't actually an ext4 issue, but a long-standing VFS/MM issue.

There are going to be multiple opinions about what the right thing to do. I'll try to give as unbiased a description as possible, but certainly some of this is going to be filtered by my own biases no matter how careful I can be.

First of all, what storage devices will do when they hit an exception condition is quite non-deterministic. For example, the vast majority of SSD's are not power fail certified. What this means is that if they suffer a power drop while they are doing a GC, it is quite possible for data written six months ago to be lost as a result.
The LBA could potentially be far, far away from any LBA's that were recently written, and there could have been multiple CACHE FLUSH operations since the LBA in question was last written six months ago. No matter; for a consumer-grade SSD, it's possible for that LBA to be trashed after an unexpected power drop.

Which is why after a while, one can get quite paranoid and assume that the only way you can guarantee data robustness is to store multiple copies and/or use erasure encoding, with some of the copies or shards written to geographically diverse data centers.

Secondly, I think it's fair to say that the vast majority of the companies who require data robustness, and are either willing to pay $$$ to an enterprise distro company like Red Hat, or command a large enough paying customer base that they can afford to dictate terms to an enterprise distro, or hire a consultant such as Christoph, or have their own staffed Linux kernel teams, have tended to use O_DIRECT. So for better or for worse, there has not been as much investment in buffered I/O and data robustness in the face of exception handling of storage devices.

Next, the reason why fsync() has the behaviour that it does is one of the most common cases of I/O storage errors in buffered use cases, certainly as seen by the community distros, is the user who pulls out USB stick while it is in use. In that case, if there are dirtied pages in the page cache, the question is what can you do? Sooner or later the writes will time out, and if you leave the pages dirty, then it effectively becomes a permanent memory leak. You can't unmount the file system --- that requires writing out all of the pages such that the dirty bit is turned off. And if you don't clear the dirty bit on an I/O error, then they can never be cleaned. You can't even re-insert the USB stick; the re-inserted USB stick will get a new block device.
Worse, when the USB stick was pulled, it will have suffered a power drop, and see above about what could happen after a power drop for non-power fail certified flash devices --- it goes double for the cheap sh*t USB sticks found in the checkout aisle of Micro Center.

So this is the explanation for why Linux handles I/O errors by clearing the dirty bit after reporting the error up to user space. And why there is not eagerness to solve the problem simply by "don't clear the dirty bit". For every one Postgres installation that might have a better recover after an I/O error, there's probably a thousand clueless Fedora and Ubuntu users who will have a much worse user experience after a USB stick pull happens.

I can think of things that could be done --- for example, it could be switchable on a per-block device basis (or maybe a per-mount basis) whether or not the dirty bit gets cleared after the error is reported to userspace. And perhaps there could be a new unmount flag that causes all dirty pages to be wiped out, which could be used to recover after a permanent loss of the block device. But the question is who is going to invest the time to make these changes? If there is a company who is willing to pay to commission this work, it's almost certainly soluble. Or if a company which has a kernel on staff is willing to direct an engineer to work on it, it certainly could be solved. But again, of the companies who have client code where we care about robustness and proper handling of failed disk drives, and which have a kernel team on staff, pretty much all of the ones I can think of (e.g., Oracle, Google, etc.)
use O_DIRECT and they don't try to make buffered writes and error reporting via fsync(2) work well.

In general these companies want low-level control over buffer cache eviction algorithms, which drives them towards the design decision of effectively implementing the page cache in userspace, and using O_DIRECT reads/writes.

If you are aware of a company who is willing to pay to have a new kernel feature implemented to meet your needs, we might be able to refer you to a company or a consultant who might be able to do that work. Let me know off-line if that's the case...

- Ted
"""

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Date: 2018-04-10 19:51:01

-hackers,

The thread is picking up over on the ext4 list. They don't update their archives as often as we do, so I can't link to the discussion. What would be the preferred method of sharing the info?

Thanks,

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Date: 2018-04-10 20:57:34

On 04/10/2018 12:51 PM, Joshua D. Drake wrote:
> -hackers,
>
> The thread is picking up over on the ext4 list. They don't update their archives as often as we do, so I can't link to the discussion. What would be the preferred method of sharing the info?

Thanks to Anthony for this link:

http://lists.openwall.net/linux-ext4/2018/04/10/33

It isn't quite real time but it keeps things close enough.

From: Jonathan Corbet <corbet(at)lwn(dot)net>
Date: 2018-04-11 12:05:27

On Tue, 10 Apr 2018 17:40:05 +0200 Anthony Iliopoulos wrote:
> LSF/MM'18 is upcoming and it would have been the perfect opportunity but it's past the CFP deadline. It may still be worth contacting the organizers to bring forward the issue, and see if there is a chance to have someone from Pg invited for further discussions.

FWIW, it is my current intention to be sure that the development community is at least aware of the issue by the time LSFMM starts.

The event is April 23-25 in Park City, Utah.
I bet that room could be found for somebody from the postgresql community, should there be somebody who would like to represent the group on this issue. Let me know if an introduction or advocacy from my direction would be helpful.

From: Greg Stark <stark(at)mit(dot)edu>
Date: 2018-04-11 12:23:49

On 10 April 2018 at 19:58, Joshua D. Drake wrote:
> You can't unmount the file system --- that requires writing out all of the pages such that the dirty bit is turned off.

I always wondered why Linux didn't implement umount -f. It's been in BSD since forever and it's a major annoyance that it's missing in Linux. Even without leaking memory it still leaks other resources, causes confusion and awkward workarounds in UI and automation software.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-11 14:29:09

Hi,

On 2018-04-11 06:05:27 -0600, Jonathan Corbet wrote:
> The event is April 23-25 in Park City, Utah. I bet that room could be found for somebody from the postgresql community, should there be somebody who would like to represent the group on this issue. Let me know if an introduction or advocacy from my direction would be helpful.

If that room can be found, I might be able to make it. Being in SF, I'm probably the physically closest PG dev involved in the discussion.

Thanks for chiming in,

From: Jonathan Corbet <corbet(at)lwn(dot)net>
Date: 2018-04-11 14:40:31

On Wed, 11 Apr 2018 07:29:09 -0700 Andres Freund wrote:
> If that room can be found, I might be able to make it. Being in SF, I'm probably the physically closest PG dev involved in the discussion.

OK, I've dropped the PC a note; hopefully you'll be hearing from them.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-17 21:19:53

On Tue, Apr 10, 2018 at 05:54:40PM +0100, Greg Stark wrote:
> On 10 April 2018 at 02:59, Craig Ringer wrote:
>> Nitpick: In most cases the kernel reserves disk space immediately, before returning from write(). NFS seems to be the main exception here.
>
> I'm kind of puzzled by this.
> Surely NFS servers store the data in the filesystem using write(2) or the in-kernel equivalent? So if the server is backed by a filesystem where write(2) preallocates space surely the NFS server must behave as if it's preallocating as well? I would expect NFS to provide basically the same set of possible failures as the underlying filesystem (as long as you don't enable nosync of course).

I don't think the write is sent to the NFS at the time of the write, so while the NFS side would reserve the space, it might not get the write request until after we return write success to the process.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-17 21:29:17

On Mon, Apr 9, 2018 at 03:42:35PM +0200, Tomas Vondra wrote:
> On 04/09/2018 12:29 AM, Bruce Momjian wrote:
> > A crazy idea would be to have a daemon that checks the logs and stops Postgres when something seems wrong.
>
> That doesn't seem like a very practical way. It's better than nothing, of course, but I wonder how would that work with containers (where I think you may not have access to the kernel log at all). Also, I'm pretty sure the messages do change based on kernel version (and possibly filesystem) so parsing it reliably seems rather difficult. And we probably don't want to PANIC after I/O error on an unrelated device, so we'd need to understand which devices are related to PostgreSQL.

My more-considered crazy idea is to have a postgresql.conf setting like archive_command that allows the administrator to specify a command that will be run after fsync but before the checkpoint is marked as complete. While we can have write flush errors before fsync and never see the errors during fsync, we will not have write flush errors after fsync that are associated with previous writes.

The script should check for I/O or space-exhaustion errors and return false in that case, in which case we can stop, or maybe stop and crash recover.
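As a rough illustration of the hook proposed above (it is not an existing PostgreSQL setting), the server could run an administrator-supplied command after fsync and map its exit status to an action. The function names, the enum, and the exact exit-code meanings (0 = continue, 1 = stop, 2 = stop and crash-recover) are assumptions made for this sketch only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical outcomes of the post-fsync check command. */
enum check_action {
    CHECK_OK,                /* no I/O or space errors seen, finish checkpoint */
    CHECK_STOP,              /* exit status 1: stop the server */
    CHECK_STOP_AND_RECOVER,  /* exit status 2: stop and run crash recovery */
    CHECK_FAILED             /* command could not run, was killed, or
                              * returned an unknown status */
};

/* Translate a wait-style status (as returned by system()) into an action. */
static enum check_action action_for_status(int wait_status)
{
    if (wait_status == -1 || !WIFEXITED(wait_status))
        return CHECK_FAILED;

    switch (WEXITSTATUS(wait_status)) {
    case 0:
        return CHECK_OK;
    case 1:
        return CHECK_STOP;
    case 2:
        return CHECK_STOP_AND_RECOVER;
    default:
        return CHECK_FAILED;
    }
}

/* Run the administrator's command through the shell and classify the result. */
static enum check_action run_post_fsync_check(const char *command)
{
    assert(command != NULL);
    return action_for_status(system(command));
}
```

Treating every unexpected status as CHECK_FAILED is a deliberate choice in this sketch: if the check itself cannot be trusted, the caller should not mark the checkpoint complete.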
We could have an exit of 1 do the former, and an exit of 2 dothe later.Also, if we are relying on WAL, we have to make sure WAL is actuallysafe with fsync, and I am betting only the O_DIRECT methods actuallyare safe: #wal_sync_method = fsync # the default is the first option # supported by the operating system: # open_datasync --> # fdatasync (default on Linux) --> # fsync --> # fsync_writethrough # open_syncI am betting the marked wal_sync_method methods are not safe since thereis time between the write and fsync.From:Bruce Momjian <bruce(at)momjian(dot)us>Date:2018-04-17 21:32:45On Mon, Apr 9, 2018 at 03:42:35PM +0200, Tomas Vondra wrote:On 04/09/2018 12:29 AM, Bruce Momjian wrote:An crazy idea would be to have a daemon that checks the logs andstops Postgres when it seems something wrong.That doesn't seem like a very practical way. It's better than nothing,of course, but I wonder how would that work with containers (where Ithink you may not have access to the kernel log at all). Also, I'mpretty sure the messages do change based on kernel version (and possiblyfilesystem) so parsing it reliably seems rather difficult. And weprobably don't want to PANIC after I/O error on an unrelated device, sowe'd need to understand which devices are related to PostgreSQL.Replying to your specific case, I am not sure how we would use a scriptto check for I/O errors/space-exhaustion if the postgres user doesn'thave access to it. 
Does O_DIRECT work in such container cases?

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-17 21:34:53

On 2018-04-17 17:29:17 -0400, Bruce Momjian wrote:
> Also, if we are relying on WAL, we have to make sure WAL is actually safe with fsync, and I am betting only the O_DIRECT methods actually are safe:
>
>     #wal_sync_method = fsync            # the default is the first option
>                                         # supported by the operating system:
>                                         #   open_datasync
>                                  -->    #   fdatasync (default on Linux)
>                                  -->    #   fsync
>                                  -->    #   fsync_writethrough
>                                         #   open_sync
>
> I am betting the marked wal_sync_method methods are not safe since there is time between the write and fsync.

Hm? That's not really the issue though? One issue is that retries are not necessarily safe in buffered IO, the other that fsync might not report an error if the fd was closed and opened.

O_DIRECT is only used if wal archiving or streaming isn't used, which makes it pretty useless anyway.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-17 21:41:42

On 2018-04-17 17:32:45 -0400, Bruce Momjian wrote:
> On Mon, Apr 9, 2018 at 03:42:35PM +0200, Tomas Vondra wrote:
> > That doesn't seem like a very practical way. It's better than nothing, of course, but I wonder how would that work with containers (where I think you may not have access to the kernel log at all). Also, I'm pretty sure the messages do change based on kernel version (and possibly filesystem) so parsing it reliably seems rather difficult. And we probably don't want to PANIC after I/O error on an unrelated device, so we'd need to understand which devices are related to PostgreSQL.

You can certainly have access to the kernel log in containers. I'd assume such a script wouldn't check various system logs but instead tail /dev/kmsg or such. Otherwise the variance between installations would be too big.

There's not that many different type of error messages and they don't change that often. If we'd just detect error for the most common FSs we'd probably be good.
Detecting a few general storage layer messages wouldn't be that hard either, most things have been unified over the last ~8-10 years.

> Replying to your specific case, I am not sure how we would use a script to check for I/O errors/space-exhaustion if the postgres user doesn't have access to it.

Not sure what you mean?

Space exhaustion can be checked when allocating space, FWIW. We'd just need to use posix_fallocate et al.

> Does O_DIRECT work in such container cases?

Yes.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-17 21:49:42

On Mon, Apr 9, 2018 at 12:25:33PM -0700, Peter Geoghegan wrote:
> On Mon, Apr 9, 2018 at 12:13 PM, Andres Freund wrote:
> > Let's lower the pitchforks a bit here. Obviously a grand rewrite is absurd, as is some of the proposed ways this is all supposed to work. But I think the case we're discussing is much closer to a near irresolvable corner case than anything else.
>
> +1
>
> > We're talking about the storage layer returning an irresolvable error. You're hosed even if we report it properly. Yes, it'd be nice if we could report it reliably. But that doesn't change the fact that what we're doing is ensuring that data is safely fsynced unless storage fails, in which case it's not safely fsynced anyway.
>
> Right. We seem to be implicitly assuming that there is a big difference between a problem in the storage layer that we could in principle detect, but don't, and any other problem in the storage layer. I've read articles claiming that technologies like SMART are not really reliable in a practical sense [1], so it seems to me that there is reason to doubt that this gap is all that big.
>
> That said, I suspect that the problems with running out of disk space are serious practical problems. I have personally scoffed at stories involving Postgres databases corruption that gets attributed to running out of disk space.
> Looks like I was dead wrong.

Yes, I think we need to look at user expectations here.

If the device has a hardware write error, it is true that it is good to detect it, and it might be permanent or temporary, e.g. NAS/NFS. The longer the error persists, the more likely the user will expect corruption. However, right now, any length outage could cause corruption, and it will not be reported in all cases.

Running out of disk space is also something you don't expect to corrupt your database --- you expect it to only prevent future writes. It seems NAS/NFS and any thin provisioned storage will have this problem, and again, not always reported.

So, our initial action might just be to educate users that write errors can cause silent corruption, and out-of-space errors on NAS/NFS and any thin provisioned storage can cause corruption.

Kernel logs (not just Postgres logs) should be monitored for these issues and fail-over/recovering might be necessary.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-18 09:52:22

On Tue, Apr 17, 2018 at 02:34:53PM -0700, Andres Freund wrote:
> On 2018-04-17 17:29:17 -0400, Bruce Momjian wrote:
> > Also, if we are relying on WAL, we have to make sure WAL is actually safe with fsync, and I am betting only the O_DIRECT methods actually are safe:
> >
> >     #wal_sync_method = fsync            # the default is the first option
> >                                         # supported by the operating system:
> >                                         #   open_datasync
> >                                  -->    #   fdatasync (default on Linux)
> >                                  -->    #   fsync
> >                                  -->    #   fsync_writethrough
> >                                         #   open_sync
> >
> > I am betting the marked wal_sync_method methods are not safe since there is time between the write and fsync.
>
> Hm? That's not really the issue though? One issue is that retries are not necessarily safe in buffered IO, the other that fsync might not report an error if the fd was closed and opened.

Well, we have been focusing on the delay between backend or checkpoint writes and checkpoint fsyncs. My point is that we have the same problem in doing a write, then fsync for the WAL.
Yes, the delay is much shorter, but the issue still exists. I realize that newer Linux kernels will not have the problem since the file descriptor remains open, but the problem exists with older/common linux kernels.

> O_DIRECT is only used if wal archiving or streaming isn't used, which makes it pretty useless anyway.

Uh, don't 'open_datasync' and 'open_sync' fsync as part of the write, meaning we can't lose the error report like we can with the others?

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-18 10:04:30

On 18 April 2018 at 05:19, Bruce Momjian wrote:
> On Tue, Apr 10, 2018 at 05:54:40PM +0100, Greg Stark wrote:
> > On 10 April 2018 at 02:59, Craig Ringer wrote:
> > > Nitpick: In most cases the kernel reserves disk space immediately, before returning from write(). NFS seems to be the main exception here.
> >
> > I'm kind of puzzled by this. Surely NFS servers store the data in the filesystem using write(2) or the in-kernel equivalent? So if the server is backed by a filesystem where write(2) preallocates space surely the NFS server must behave as if it's preallocating as well?
> > I would expect NFS to provide basically the same set of possible failures as the underlying filesystem (as long as you don't enable nosync of course).
>
> I don't think the write is sent to the NFS at the time of the write, so while the NFS side would reserve the space, it might not get the write request until after we return write success to the process.

It should be sent if you're using sync mode.

From my reading of the docs, if you're using async mode you're already open to so many potential corruptions you might as well not bother.

I need to look into this more re NFS and expand the tests I have to cover that properly.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-18 10:19:28

On 10 April 2018 at 20:15, Craig Ringer wrote:
> On 10 April 2018 at 14:10, Michael Paquier wrote:
> > Well, I think that there is place for improving reporting of failure in file_utils.c for frontends, or at worst have an exit() for any kind of critical failures equivalent to a PANIC.
>
> Yup.
>
> In the mean time, speaking of PANIC, here's the first cut patch to make Pg panic on fsync() failures. I need to do some closer review and testing, but it's presented here for anyone interested.
>
> I intentionally left some failures as ERROR not PANIC, where the entire operation is done as a unit, and an ERROR will cause us to retry the whole thing.
>
> For example, when we fsync() a temp file before we move it into place, there's no point panicking on failure, because we'll discard the temp file on ERROR and retry the whole thing.
>
> I've verified that it works as expected with some modifications to the test tool I've been using (pushed).
>
> The main downside is that if we panic in redo, we don't try again. We throw our toys and shut down. But arguably if we get the same I/O error again in redo, that's the right thing to do anyway, and quite likely safer than continuing to ERROR on checkpoints indefinitely.
>
> Patch attached.
>
> To be clear, this patch only deals with the issue of us retrying fsyncs when it turns out to be unsafe.
> This does NOT address any of the issues where we won't find out about writeback errors at all.

Thinking about this some more, it'll definitely need a GUC to force it to continue despite a potential hazard. Otherwise we go backwards from the status quo if we're in a position where uptime is vital and correctness problems can be tolerated or repaired later. Kind of like zero_damaged_pages, we'll need some sort of continue_after_fsync_errors.

Without that, we'll panic once, enter redo, and if the problem persists we'll panic in redo and exit the startup process. That's not going to help users.

I'll amend the patch accordingly as time permits.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-18 11:46:15

On Wed, Apr 18, 2018 at 06:04:30PM +0800, Craig Ringer wrote:
> On 18 April 2018 at 05:19, Bruce Momjian wrote:
> > On Tue, Apr 10, 2018 at 05:54:40PM +0100, Greg Stark wrote:
> > > On 10 April 2018 at 02:59, Craig Ringer wrote:
> > > > Nitpick: In most cases the kernel reserves disk space immediately, before returning from write(). NFS seems to be the main exception here.
> > >
> > > I'm kind of puzzled by this. Surely NFS servers store the data in the filesystem using write(2) or the in-kernel equivalent? So if the server is backed by a filesystem where write(2) preallocates space surely the NFS server must behave as if it's preallocating as well?
> > > I would expect NFS to provide basically the same set of possible failures as the underlying filesystem (as long as you don't enable nosync of course).
> >
> > I don't think the write is sent to the NFS at the time of the write, so while the NFS side would reserve the space, it might not get the write request until after we return write success to the process.
>
> It should be sent if you're using sync mode.
>
> From my reading of the docs, if you're using async mode you're already open to so many potential corruptions you might as well not bother.
>
> I need to look into this more re NFS and expand the tests I have to cover that properly.

So, if sync mode passes the write to NFS, and NFS pre-reserves write space, and throws an error on reservation failure, that means that NFS will not corrupt a cluster on out-of-space errors.

So, what about thin provisioning? I can understand sharing free space among file systems, but once a write arrives I assume it reserves the space. Is the problem that many thin provisioning systems don't have a sync mode, so you can't force the write to appear on the device before an fsync?

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-18 11:56:57

On Tue, Apr 17, 2018 at 02:41:42PM -0700, Andres Freund wrote:
> On 2018-04-17 17:32:45 -0400, Bruce Momjian wrote:
> > On Mon, Apr 9, 2018 at 03:42:35PM +0200, Tomas Vondra wrote:
> > > That doesn't seem like a very practical way. It's better than nothing, of course, but I wonder how would that work with containers (where I think you may not have access to the kernel log at all). Also, I'm pretty sure the messages do change based on kernel version (and possibly filesystem) so parsing it reliably seems rather difficult. And we probably don't want to PANIC after I/O error on an unrelated device, so we'd need to understand which devices are related to PostgreSQL.
>
> You can certainly have access to the kernel log in containers. I'd assume such a script wouldn't check various system logs but instead tail /dev/kmsg or such.
> Otherwise the variance between installations would be too big.

I was thinking 'dmesg', but the result is similar.

> There's not that many different type of error messages and they don't change that often. If we'd just detect error for the most common FSs we'd probably be good. Detecting a few general storage layer messages wouldn't be that hard either, most things have been unified over the last ~8-10 years.

It is hard to know exactly what the message format should be for each operating system because it is hard to generate them on demand, and we would need to filter based on Postgres devices.

The other issue is that once you see a message during a checkpoint and exit, you don't want to see that message again after the problem has been fixed and the server restarted. The simplest solution is to save the output of the last check and look for only new entries. I am attaching a script I run every 15 minutes from cron that emails me any unexpected kernel messages.

I am thinking we would need a contrib module with sample scripts for various operating systems.

> > Replying to your specific case, I am not sure how we would use a script to check for I/O errors/space-exhaustion if the postgres user doesn't have access to it.
>
> Not sure what you mean?
>
> Space exhaustion can be checked when allocating space, FWIW. We'd just need to use posix_fallocate et al.

I was asking about cases where permissions prevent viewing of kernel messages. I think you can view them in containers, but in virtual machines you might not have access to the host operating system's kernel messages, and that might be where they are.

Attachment: dmesg_check (Content-Type: text/plain, Size: 574 bytes)

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-18 12:45:53

On 18 April 2018 at 19:46, Bruce Momjian wrote:
> So, if sync mode passes the write to NFS, and NFS pre-reserves write space, and throws an error on reservation failure, that means that NFS will not corrupt a cluster on out-of-space errors.

Yeah.
I need to verify in a concrete test case.

The thing is that write() is allowed to be asynchronous anyway. Most file systems choose to implement eager reservation of space, but it's not mandated. AFAICS that's largely a historical accident to keep applications happy, because FSes used to allocate the space at write() time too, and when they moved to delayed allocations, apps tended to break too easily unless they at least reserved space. NFS would have to do a round-trip on write() to reserve space.

The Linux man pages (http://man7.org/linux/man-pages/man2/write.2.html) say:

    "A successful return from write() does not make any guarantee that data has been committed to disk. On some filesystems, including NFS, it does not even guarantee that space has successfully been reserved for the data. In this case, some errors might be delayed until a future write(2), fsync(2), or even close(2). The only way to be sure is to call fsync(2) after you are done writing all your data."

... and I'm inclined to believe it when it refuses to make guarantees. Especially lately.

> So, what about thin provisioning? I can understand sharing free space among file systems

Most thin provisioning is done at the block level, not file system level. So the FS is usually unaware it's on a thin-provisioned volume. Usually the whole kernel is unaware, because the thin provisioning is done on the SAN end or by a hypervisor. But the same sort of thing may be done via LVM - see lvmthin. For example, you may make 100 different 1TB ext4 FSes, each on 1TB iSCSI volumes backed by SAN with a total of 50TB of concrete physical capacity. The SAN is doing block mapping and only allocating storage chunks to a given volume when the FS has written blocks to every previous free block in the previous storage chunk.
It may also do things like block de-duplication, compression of storage chunks that aren't written to for a while, etc.

The idea is that when the SAN's actual physically allocated storage gets to 40TB it starts telling you to go buy another rack of storage so you don't run out. You don't have to resize volumes, resize file systems, etc. All the storage space admin is centralized on the SAN and storage team, and your sysadmins, DBAs and app devs are none the wiser. You buy storage when you need it, not when the DBA demands they need a 200% free space margin just in case. Whether or not you agree with this philosophy or think it's sensible is kind of moot, because it's an extremely widespread model, and servers you work on may well be backed by thin provisioned storage even if you don't know it.

Think of it as a bit like VM overcommit, for storage. You can malloc() as much memory as you like and everything's fine until you try to actually use it. Then you go to dirty a page, no free pages are available, and boom.

The thing is, the SAN (or LVM) doesn't have any idea about the FS's internal in-memory free space counter and its space reservations. Nor does it understand any FS metadata. All it cares about is "has this LBA ever been written to by the FS?". If so, it must make sure backing storage for it exists. If not, it won't bother.

Most FSes only touch the blocks on dirty writeback, or sometimes lazily as part of delayed allocation. So if your SAN is running out of space and there's 100MB free, each of your 100 FSes may have decremented its freelist by 2MB and be happily promising more space to apps on write() because, well, as far as they know they're only 50% full. When they all do dirty writeback and flush to storage, kaboom, there's nowhere to put some of the data.

I don't know if posix_fallocate is a sufficient safeguard either. You'd have to actually force writes to each page through to the backing storage to know for sure the space existed.
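A minimal sketch of what "forcing writes through" could look like from userspace: fill the new range with zeros using pwrite() and then fsync() before relying on the space. The helper name extend_with_zeros and the 8kB buffer size are assumptions for illustration, not anything taken from PostgreSQL.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Report which step failed and hand the errno value back to the caller. */
static int fail(const char *step, int err)
{
    fprintf(stderr, "extend_with_zeros: %s failed: %s\n", step, strerror(err));
    return err;
}

/*
 * Hypothetical helper: write `len` zero bytes to fd starting at `offset`,
 * then fsync() so the blocks are pushed to the storage layer now instead
 * of at some later writeback. Returns 0 on success or an errno value.
 * It never retries fsync(): a failure here must be treated as final.
 */
static int extend_with_zeros(int fd, off_t offset, size_t len)
{
    char zeros[8192];

    memset(zeros, 0, sizeof(zeros));

    while (len > 0) {
        size_t chunk = len < sizeof(zeros) ? len : sizeof(zeros);
        ssize_t n = pwrite(fd, zeros, chunk, offset);

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before writing; try again */
            return fail("pwrite", errno);
        }
        if (n == 0)
            return fail("pwrite", ENOSPC);  /* no progress; assume no space */

        offset += n;
        len -= (size_t) n;
    }

    if (fsync(fd) != 0)
        return fail("fsync", errno);
    return 0;
}
```

Even this is only a partial safeguard under the assumptions discussed in this thread: storage that de-duplicates or compresses blocks may not back zero-filled ranges with real capacity, and a copy-on-write filesystem allocates new blocks on later overwrites anyway.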
Yes, the docs say

    "After a successful call to posix_fallocate(), subsequent writes to bytes in the specified range are guaranteed not to fail because of lack of disk space."

... but they're speaking from the filesystem's perspective. If the FS doesn't dirty and flush the actual blocks, a thin provisioned storage system won't know.

It's reasonable enough to throw up our hands in this case and say "your setup is crazy, you're breaking the rules, don't do that". The truth is they AREN'T breaking the rules, but we can disclaim support for such configurations anyway.

After all, we tell people not to use Linux's VM overcommit too. How's that working for you? I see it enabled on the great majority of systems I work with, and some people are very reluctant to turn it off because they don't want to have to add swap.

If someone has a 50TB SAN and wants to allow for unpredictable space use expansion between various volumes, and we say "you can't do that, go buy a 100TB SAN instead" ... that's not going to go down too well either. Often we can actually say "make sure the 5TB volume PostgreSQL is using is eagerly provisioned, and expand it at need using online resize if required. We don't care about the rest of the SAN.".

I guarantee you that when you create a 100GB EBS volume on AWS EC2, you don't get 100GB of storage preallocated. AWS are probably pretty good about not running out of backing store, though.

There are file systems optimised for thin provisioning, etc, too. But that's more commonly done by having them do things like zero deallocated space so the thin provisioning system knows it can return it to the free pool, and now things like DISCARD provide much of that signalling in a standard way.

From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
Date: 2018-04-18 23:31:50

On 19/04/18 00:45, Craig Ringer wrote:
> I guarantee you that when you create a 100GB EBS volume on AWS EC2, you don't get 100GB of storage preallocated.
> AWS are probably pretty good about not running out of backing store, though.

Some db folks (used to anyway) advise dd'ing to your freshly attached devices on AWS (for performance mainly IIRC), but that would help prevent some failure scenarios for any thin provisioned storage (but probably really annoy the admins thereof).

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-19 00:44:33

On 19 April 2018 at 07:31, Mark Kirkwood wrote:
> On 19/04/18 00:45, Craig Ringer wrote:
> > I guarantee you that when you create a 100GB EBS volume on AWS EC2, you don't get 100GB of storage preallocated. AWS are probably pretty good about not running out of backing store, though.
>
> Some db folks (used to anyway) advise dd'ing to your freshly attached devices on AWS (for performance mainly IIRC), but that would help prevent some failure scenarios for any thin provisioned storage (but probably really annoy the admins thereof).

This still makes a lot of sense on AWS EBS, particularly when using a volume created from a non-empty snapshot. Performance of S3-snapshot based EBS volumes is spectacularly awful, since they're copy-on-read. Reading the whole volume helps a lot.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-20 20:49:08

On Wed, Apr 18, 2018 at 08:45:53PM +0800, Craig Ringer wrote:
> On 18 April 2018 at 19:46, Bruce Momjian wrote:
> > So, if sync mode passes the write to NFS, and NFS pre-reserves write space, and throws an error on reservation failure, that means that NFS will not corrupt a cluster on out-of-space errors.
>
> Yeah. I need to verify in a concrete test case.

Thanks.

> The thing is that write() is allowed to be asynchronous anyway. Most file systems choose to implement eager reservation of space, but it's not mandated. AFAICS that's largely a historical accident to keep applications happy, because FSes used to allocate the space at write() time too, and when they moved to delayed allocations, apps tended to break too easily unless they at least reserved space.
> NFS would have to do a round-trip on write() to reserve space.
>
> The Linux man pages (http://man7.org/linux/man-pages/man2/write.2.html) say:
>
>     "A successful return from write() does not make any guarantee that data has been committed to disk. On some filesystems, including NFS, it does not even guarantee that space has successfully been reserved for the data. In this case, some errors might be delayed until a future write(2), fsync(2), or even close(2). The only way to be sure is to call fsync(2) after you are done writing all your data."
>
> ... and I'm inclined to believe it when it refuses to make guarantees. Especially lately.

Uh, even calling fsync after write isn't 100% safe since the kernel could have flushed the dirty pages to storage, and failed, and the fsync would later succeed. I realize newer kernels have that fixed for files open during that operation, but that is the minority of installs.

> The idea is that when the SAN's actual physically allocated storage gets to 40TB it starts telling you to go buy another rack of storage so you don't run out. You don't have to resize volumes, resize file systems, etc. All the storage space admin is centralized on the SAN and storage team, and your sysadmins, DBAs and app devs are none the wiser. You buy storage when you need it, not when the DBA demands they need a 200% free space margin just in case. Whether or not you agree with this philosophy or think it's sensible is kind of moot, because it's an extremely widespread model, and servers you work on may well be backed by thin provisioned storage even if you don't know it.
>
> Most FSes only touch the blocks on dirty writeback, or sometimes lazily as part of delayed allocation. So if your SAN is running out of space and there's 100MB free, each of your 100 FSes may have decremented its freelist by 2MB and be happily promising more space to apps on write() because, well, as far as they know they're only 50% full.
> When they all do dirty writeback and flush to storage, kaboom, there's nowhere to put some of the data.

I see what you are saying --- that the kernel is reserving the write space from its free space, but the free space doesn't all exist. I am not sure how we can tell people to make sure the file system free space is real.

> You'd have to actually force writes to each page through to the backing storage to know for sure the space existed. Yes, the docs say
>
>     "After a successful call to posix_fallocate(), subsequent writes to bytes in the specified range are guaranteed not to fail because of lack of disk space."
>
> ... but they're speaking from the filesystem's perspective. If the FS doesn't dirty and flush the actual blocks, a thin provisioned storage system won't know.

Frankly, in what cases will a write fail for lack of free space? It could be a new WAL file (not recycled), or pages added to the end of the table.

Is that it? It doesn't sound too terrible. If we can eliminate the corruption due to free space exhaustion, it would be a big step forward.

The next most common failure would be temporary storage failure or storage communication failure.

Permanent storage failure is "game over" so we don't need to worry about that.

From: Gasper Zejn <zejn(at)owca(dot)info>
Date: 2018-04-21 19:21:39

Just for the record, I tried the test case with ZFS on Ubuntu 17.10 host with ZFS on Linux 0.6.5.11.

ZFS does not swallow the fsync error, but the system does not handle the error nicely: the test case program hangs on fsync, the load jumps up and there's a bunch of z_wr_iss and z_null_int kernel threads belonging to zfs, eating up the CPU.

Even then I managed to reboot the system, so it's not a complete and utter mess.

The test case adjustments are here:
https://github.com/zejn/scrapcode/commit/e7612536c346d59a4b69bedfbcafbe8c1079063c

Kind regards,

On 29. 03. 2018 07:25, Craig Ringer wrote:
> On 29 March 2018 at 13:06, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> > On Thu, Mar 29, 2018 at 6:00 PM, Justin Pryzby wrote:
> > > The retries are the source of the problem; the first fsync() can return EIO, and also *clears the error* causing a 2nd fsync (of the same data) to return success.
> >
> > What I'm failing to grok here is how that error flag even matters, whether it's a single bit or a counter as described in that patch. If write back failed, *the page is still dirty*. So all future calls to fsync() need to try to flush it again, and (presumably) fail again (unless it happens to succeed this time around).
>
> You'd think so. But it doesn't appear to work that way. You can see yourself with the error device-mapper destination mapped over part of a volume.
>
> I wrote a test case here.
>
> https://github.com/ringerc/scrapcode/blob/master/testcases/fsync-error-clear.c
>
> I don't pretend the kernel behaviour is sane. And it's possible I've made an error in my analysis. But since I've observed this in the wild, and seen it in a test case, I strongly suspect that what I've described is just what's happening, brain-dead or no.
>
> Presumably the kernel marks the page clean when it dispatches it to the I/O subsystem and doesn't dirty it again on I/O error? I haven't dug that deep on the kernel side. See the stackoverflow post for details on what I found in kernel code analysis.

From: Andres Freund <andres(at)anarazel(dot)de>
Date: 2018-04-23 20:14:48

Hi,

On 2018-03-28 10:23:46 +0800, Craig Ringer wrote:
> TL;DR: Pg should PANIC on fsync() EIO return. Retrying fsync() is not OK at least on Linux. When fsync() returns success it means "all writes since the last fsync have hit disk" but we assume it means "all writes since the last SUCCESSFUL fsync have hit disk".
>
> But then we retried the checkpoint, which retried the fsync().
> The retry succeeded, because the prior fsync() cleared the AS_EIO bad page flag.

Random other thing we should look at: Some filesystems (nfs yes, xfs ext4 no) flush writes at close(2). We check close() return code, just log it... So close() counts as an fsync for such filesystems.

I'm at LSF/MM to discuss future behaviour of linux here, but that's how it is right now.

From: Bruce Momjian <bruce(at)momjian(dot)us>
Date: 2018-04-24 00:09:23

On Mon, Apr 23, 2018 at 01:14:48PM -0700, Andres Freund wrote:
> Hi,
>
> On 2018-03-28 10:23:46 +0800, Craig Ringer wrote:
> > TL;DR: Pg should PANIC on fsync() EIO return. Retrying fsync() is not OK at least on Linux. When fsync() returns success it means "all writes since the last fsync have hit disk" but we assume it means "all writes since the last SUCCESSFUL fsync have hit disk".
> >
> > But then we retried the checkpoint, which retried the fsync(). The retry succeeded, because the prior fsync() cleared the AS_EIO bad page flag.
>
> Random other thing we should look at: Some filesystems (nfs yes, xfs ext4 no) flush writes at close(2). We check close() return code, just log it... So close() counts as an fsync for such filesystems.

Well, that's interesting. You might remember that NFS does not reserve space for writes like local file systems like ext4/xfs do. For that reason, we might be able to capture the out-of-space error on close and exit sooner for NFS.

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
Date: 2018-04-26 02:16:52

On 24 April 2018 at 04:14, Andres Freund wrote:
> I'm at LSF/MM to discuss future behaviour of linux here, but that's how it is right now.

Interim LWN.net coverage of that can be found here:

https://lwn.net/Articles/752613/

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Date: 2018-04-27 01:18:55

On Tue, Apr 24, 2018 at 12:09 PM, Bruce Momjian wrote:
> On Mon, Apr 23, 2018 at 01:14:48PM -0700, Andres Freund wrote:
> > Hi,
> >
> > On 2018-03-28 10:23:46 +0800, Craig Ringer wrote:
> > > TL;DR: Pg should PANIC on fsync() EIO return.
Retrying fsync() is not OK atleast on Linux. When fsync() returns success it means "all writes since thelast fsync have hit disk" but we assume it means "all writes since the lastSUCCESSFUL fsync have hit disk".But then we retried the checkpoint, which retried the fsync(). The retrysucceeded, because the prior fsync() cleared the AS_EIO bad page flag.Random other thing we should look at: Some filesystems (nfs yes, xfsext4 no) flush writes at close(2). We check close() return code, justlog it... So close() counts as an fsync for such filesystems().Well, that's interesting. You might remember that NFS does not reservespace for writes like local file systems like ext4/xfs do. For thatreason, we might be able to capture the out-of-space error on close andexit sooner for NFS.It seems like some implementations flush on close and thereforediscover ENOSPC problem at that point, unless they have NVSv4 (RFC3050) "write delegation" with a promise from the server that a certainamount of space is available. It seems like you can't count on thatin any way though, because it's the server that decides when todelegate and how much space to promise is preallocated, not theclient. 
So in userspace you always need to be able to handle errorsincluding ENOSPC returned by close(), and if you ignore that andyou're using an operating system that immediately incinerates allevidence after telling you that (so that later fsync() doesn't fail),you're in trouble.Some relevant code:https://github.com/torvalds/linux/commit/5445b1fbd123420bffed5e629a420aa2a16bf849https://github.com/freebsd/freebsd/blob/master/sys/fs/nfsclient/nfs_clvnops.c#L618It looks like the bleeding edge of the NFS spec includes a newALLOCATE operation that should be able to support posix_fallocate()(if we were to start using that for extending files):https://tools.ietf.org/html/rfc7862#page-64I'm not sure how reliable [posix_]fallocate is on NFS in generalthough, and it seems that there are fall-back implementations ofposix_fallocate() that write zeros (or even just feign success?) whichprobably won't do anything useful here if not also flushed (thatfallback strategy might only work on eager reservation filesystemsthat don't have direct fallocate support?) so there are several layers(libc, kernel, nfs client, nfs server) that'd need to be aligned forthat to work, and it's not clear how a humble userspace program issupposed to know if they are.I guess if you could find a way to amortise the cost of extending(like Oracle et al do by extending big container datafiles 10MB at atime or whatever), then simply writing zeros and flushing when doingthat might work out OK, so you wouldn't need such a thing? (Unless ofcourse it's a COW filesystem, but that's a different can of worms.)This thread continues on the ext4 mailing list:From: "Joshua D. Drake" <jd@...mandprompt.com>Subject: fsync() errors is unsafe and risks data lossDate: Tue, 10 Apr 2018 09:28:15 -0700-ext4,If this is not the appropriate list please point me in the rightdirection. I am a PostgreSQL contributor and we have come across areliability problem with writes and fsync(). 
You can see the thread here:

https://www.postgresql.org/message-id/flat/20180401002038.GA2211%40paquier.xyz#20180401002038.GA2211@paquier.xyz

The tl;dr; in the first message doesn't quite describe the problem as we started to dig into it further.

From: "Darrick J. Wong" <darrick.wong@...cle.com>
Date: Tue, 10 Apr 2018 09:54:43 -0700

On Tue, Apr 10, 2018 at 09:28:15AM -0700, Joshua D. Drake wrote:

> -ext4,
>
> If this is not the appropriate list please point me in the right direction. I am a PostgreSQL contributor and we have come across a reliability problem with writes and fsync(). You can see the thread here:
>
> https://www.postgresql.org/message-id/flat/20180401002038.GA2211%40paquier.xyz#20180401002038.GA2211@paquier.xyz
>
> The tl;dr; in the first message doesn't quite describe the problem as we started to dig into it further.

You might try the XFS list (linux-xfs@...r.kernel.org) seeing as the initial complaint is against xfs behaviors...

From: "Joshua D. Drake" <jd@...mandprompt.com>
Date: Tue, 10 Apr 2018 09:58:21 -0700

On 04/10/2018 09:54 AM, Darrick J. Wong wrote:

> On Tue, Apr 10, 2018 at 09:28:15AM -0700, Joshua D. Drake wrote:
>> [...]
>
> You might try the XFS list (linux-xfs@...r.kernel.org) seeing as the initial complaint is against xfs behaviors...

Later in the thread it becomes apparent that it applies to ext4 (NFS too) as well. I picked ext4 because I assumed it is the most populated of the lists since it's the default filesystem for most distributions.

From: "Theodore Y. Ts'o" <tytso@....edu>
Date: Tue, 10 Apr 2018 14:43:56 -0400

Hi Joshua,

This isn't actually an ext4 issue, but a long-standing VFS/MM issue.

There are going to be multiple opinions about what the right thing to do is. I'll try to give as unbiased a description as possible, but certainly some of this is going to be filtered by my own biases no matter how careful I can be.

First of all, what storage devices will do when they hit an exception condition is quite non-deterministic. For example, the vast majority of SSD's are not power fail certified. What this means is that if they suffer a power drop while they are doing a GC, it is quite possible for data written six months ago to be lost as a result. The LBA could potentially be far, far away from any LBA's that were recently written, and there could have been multiple CACHE FLUSH operations in the time since the LBA in question was last written six months ago. No matter; for a consumer-grade SSD, it's possible for that LBA to be trashed after an unexpected power drop.

Which is why after a while, one can get quite paranoid and assume that the only way you can guarantee data robustness is to store multiple copies and/or use erasure encoding, with some of the copies or shards written to geographically diverse data centers.

Secondly, I think it's fair to say that the vast majority of the companies who require data robustness, and are either willing to pay $$$ to an enterprise distro company like Red Hat, or command a large enough paying customer base that they can afford to dictate terms to an enterprise distro, or hire a consultant such as Christoph, or have their own staffed Linux kernel teams, have tended to use O_DIRECT. So for better or for worse, there has not been as much investment in buffered I/O and data robustness in the face of exception handling of storage devices.

Next, the reason why fsync() has the behaviour that it does is that one of the most common cases of I/O storage errors in buffered use cases, certainly as seen by the community distros, is the user who pulls out a USB stick while it is in use. In that case, if there are dirtied pages in the page cache, the question is what can you do? Sooner or later the writes will time out, and if you leave the pages dirty, then it effectively becomes a permanent memory leak. You can't unmount the file system --- that requires writing out all of the pages such that the dirty bit is turned off. And if you don't clear the dirty bit on an I/O error, then they can never be cleaned. You can't even re-insert the USB stick; the re-inserted USB stick will get a new block device. Worse, when the USB stick was pulled, it will have suffered a power drop, and see above about what could happen after a power drop for non-power fail certified flash devices --- it goes double for the cheap sh*t USB sticks found in the checkout aisle of Micro Center.

So this is the explanation for why Linux handles I/O errors by clearing the dirty bit after reporting the error up to user space. And why there is no eagerness to solve the problem simply by "don't clear the dirty bit". For every one Postgres installation that might have a better recovery after an I/O error, there's probably a thousand clueless Fedora and Ubuntu users who will have a much worse user experience after a USB stick pull happens.

I can think of things that could be done --- for example, it could be switchable on a per-block device basis (or maybe a per-mount basis) whether or not the dirty bit gets cleared after the error is reported to userspace. And perhaps there could be a new unmount flag that causes all dirty pages to be wiped out, which could be used to recover after a permanent loss of the block device.
But the question is who is going to invest the time to make these changes? If there is a company who is willing to pay to commission this work, it's almost certainly soluble. Or if a company which has a kernel team on staff is willing to direct an engineer to work on it, it certainly could be solved. But again, of the companies who have client code where we care about robustness and proper handling of failed disk drives, and which have a kernel team on staff, pretty much all of the ones I can think of (e.g., Oracle, Google, etc.) use O_DIRECT and they don't try to make buffered writes and error reporting via fsync(2) work well.

In general these companies want low-level control over buffer cache eviction algorithms, which drives them towards the design decision of effectively implementing the page cache in userspace, and using O_DIRECT reads/writes.

If you are aware of a company who is willing to pay to have a new kernel feature implemented to meet your needs, we might be able to refer you to a company or a consultant who might be able to do that work. Let me know off-line if that's the case...

From: Andreas Dilger <adilger@...ger.ca>
Date: Tue, 10 Apr 2018 13:44:48 -0600

On Apr 10, 2018, at 10:50 AM, Joshua D. Drake jd@...mandprompt.com wrote:

> [...]
>
> The tl;dr; in the first message doesn't quite describe the problem as we started to dig into it further.

Yes, this is a very long thread. The summary is Postgres is unhappy that fsync() on Linux (and also other OSes) returns an error once if there was a prior write() failure, instead of keeping dirty pages in memory forever and trying to rewrite them.

This behaviour has existed on Linux forever, and (for better or worse) is the only reasonable behaviour that the kernel can take. I've argued for the opposite behaviour at times, and some subsystems already do limited retries before finally giving up on a failed write, though there are also times when retrying at lower levels is pointless if a higher level of code can handle the failure (e.g. mirrored block devices, filesystem data mirroring, userspace data mirroring, or cross-node replication).

The confusion is whether fsync() is a "level" state (return error forever if there were pages that could not be written), or an "edge" state (return error only for any write failures since the previous fsync() call).

I think Anthony Iliopoulos was pretty clear in his multiple descriptions in that thread of why the current behaviour is needed (OOM of the whole system if dirty pages are kept around forever), but many others were stuck on "I can't believe this is happening??? This is totally unacceptable and every kernel needs to change to match my expectations!!!" without looking at the larger picture of what is practical to change and where the issue should best be fixed.

Regardless of why this is the case, the net is that PG needs to deal with all of the systems that currently exist that have this behaviour, even if some day in the future it may change (though that is unlikely).
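
As a rough illustration of what coping with "edge" style reporting could look like in application code, here is a minimal C sketch. It assumes a POSIX system; the helper names fsync_or_die and write_durably are hypothetical and are not taken from PostgreSQL or from this thread. The one idea it shows is the one from the TL;DR earlier in the thread: the first fsync() failure is treated as fatal and never retried, because a later fsync() returning 0 says nothing about the writes that already failed.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush fd and abort instead of retrying on failure.
 * The error may be reported only once, so a retry that succeeds proves
 * nothing; the thread's suggestion is to stop and go through recovery.
 */
static void fsync_or_die(int fd, const char *path)
{
    if (fsync(fd) != 0) {
        fprintf(stderr, "fsync(%s): %s\n", path, strerror(errno));
        abort();
    }
}

/*
 * Hypothetical helper: write len bytes to path and flush them.
 * Returns 0 on success, -1 if open(), write() or close() fails.
 * An fsync() failure never returns to the caller.
 */
static int write_durably(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before writing; try again */
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    fsync_or_die(fd, path);

    /* close() can also report a deferred write error, so check it. */
    return close(fd) == 0 ? 0 : -1;
}
```

The descriptor is kept open from the first write() until after fsync(), which is the case in which the thread says the error is reported. Whether abort() is the right reaction is an application decision; it stands in here for "enter crash recovery or shut down".
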
It seems ironic that "keep dirty pages in userspace until fsync() returns success" is totally unacceptable, but "keep dirty pages in the kernel" is fine. My (limited) understanding of databases was that they preferred to cache everything in userspace and use O_DIRECT to write to disk (which returns an error immediately if the write fails and does not double buffer data).

From: Martin Steigerwald <martin@...htvoll.de>
Date: Tue, 10 Apr 2018 21:47:21 +0200

Hi Theodore, Darrick, Joshua.

CC'd fsdevel as it does not appear to be Ext4 specific to me (and to you as well, Theodore).

Theodore Y. Ts'o - 10.04.18, 20:43:

> This isn't actually an ext4 issue, but a long-standing VFS/MM issue.
>
> [...]
>
> First of all, what storage devices will do when they hit an exception condition is quite non-deterministic. For example, the vast majority of SSD's are not power fail certified. What this means is that if they suffer a power drop while they are doing a GC, it is quite possible for data written six months ago to be lost as a result. The LBA could potentially be far, far away from any LBA's that were recently written, and there could have been multiple CACHE FLUSH operations in the time since the LBA in question was last written six months ago. No matter; for a consumer-grade SSD, it's possible for that LBA to be trashed after an unexpected power drop.

Guh. I was not aware of this. I knew consumer-grade SSDs often do not have power loss protection, but still thought they'd handle garbage collection in an atomic way. Sometimes I am tempted to sing an "all hardware is crap" song (starting with Meltdown/Spectre, then probably heading over to storage devices and so on including firmware crap like Intel ME).

> Next, the reason why fsync() has the behaviour that it does is that one of the most common cases of I/O storage errors in buffered use cases, certainly as seen by the community distros, is the user who pulls out a USB stick while it is in use. In that case, if there are dirtied pages in the page cache, the question is what can you do? Sooner or later the writes will time out, and if you leave the pages dirty, then it effectively becomes a permanent memory leak. You can't unmount the file system --- that requires writing out all of the pages such that the dirty bit is turned off. And if you don't clear the dirty bit on an I/O error, then they can never be cleaned. You can't even re-insert the USB stick; the re-inserted USB stick will get a new block device. Worse, when the USB stick was pulled, it will have suffered a power drop, and see above about what could happen after a power drop for non-power fail certified flash devices --- it goes double for the cheap sh*t USB sticks found in the checkout aisle of Micro Center.

From the original PostgreSQL mailing list thread I did not get how exactly FreeBSD differs in behavior, compared to Linux. I am aware of one operating system that from a user point of view handles this in almost the right way IMHO: AmigaOS.

When you removed a floppy disk from the drive while the OS was writing to it, it showed a "You MUST insert volume somename into drive somedrive:" and if you did, it just continued writing. (The part that did not work well was that with the original filesystem if you did not insert it back, the whole disk was corrupted, usually to the point beyond repair, so the "MUST" was no joke.)

In my opinion from a user's point of view this is the only sane way to handle the premature removal of removable media. I have read of a GSoC project to implement something like this for NetBSD but I did not check on the outcome of it. But in MS-DOS I think there has been something similar, however MS-DOS is not a multitasking operating system as AmigaOS is.

Implementing something like this for Linux would be quite a feat, I think, cause in addition to the implementation in the kernel, the desktop environment or whatever other userspace you use would need to handle it as well, so you'd have to adapt udev / udisks / probably Systemd. And probably this behavior needs to be restricted to anything that is really removable and even then in order to prevent memory exhaustion in case processes continue to write to a removed and not yet re-inserted USB harddisk the kernel would need to halt I/O processes which dirty I/O to this device. (I believe this is what AmigaOS did. It just blocked all subsequent I/O to the device till it was re-inserted. But then the I/O handling in that OS at that time is quite different from what Linux does.)

> So this is the explanation for why Linux handles I/O errors by clearing the dirty bit after reporting the error up to user space. And why there is no eagerness to solve the problem simply by "don't clear the dirty bit". For every one Postgres installation that might have a better recovery after an I/O error, there's probably a thousand clueless Fedora and Ubuntu users who will have a much worse user experience after a USB stick pull happens.

I was not aware that flash based media may be as crappy as you hint at.

From my tests with AmigaOS 4.something or AmigaOS 3.9 + 3rd Party Poseidon USB stack the above mechanism worked even with USB sticks. I however did not test this often and I did not check for data corruption after a test.

From: Andres Freund <andres@...razel.de>
Date: Tue, 10 Apr 2018 15:07:26 -0700

(Sorry if I screwed up the thread structure - I had to reconstruct the reply-to and CC list from web archive as I've not found a way to properly download an mbox or such of old content. Was subscribed to fsdevel but not ext4 lists)

Hi,

2018-04-10 18:43:56 Ted wrote:

> I'll try to give as unbiased a description as possible, but certainly some of this is going to be filtered by my own biases no matter how careful I can be.

Same ;)

2018-04-10 18:43:56 Ted wrote:

> So for better or for worse, there has not been as much investment in buffered I/O and data robustness in the face of exception handling of storage devices.

That's a bit of a cop out. It's not just databases that care. Even more basic tools like SCM, package managers and editors care whether they can get proper responses back from fsync that imply things actually were synced.

2018-04-10 18:43:56 Ted wrote:

> So this is the explanation for why Linux handles I/O errors by clearing the dirty bit after reporting the error up to user space. And why there is no eagerness to solve the problem simply by "don't clear the dirty bit". For every one Postgres installation that might have a better recovery after an I/O error, there's probably a thousand clueless Fedora and Ubuntu users who will have a much worse user experience after a USB stick pull happens.

I don't think these necessarily are as contradictory goals as you paint them. At least in postgres' case we can deal with the fact that an fsync retry isn't going to fix the problem by reentering crash recovery or just shutting down - therefore we don't need to keep all the dirty buffers around. A per-inode or per-superblock bit that causes further fsyncs to fail would be entirely sufficient for that.

While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.

Both in postgres, and a lot of other applications, it's not at all guaranteed to consistently have one FD open for every file written. Therefore even the more recent per-fd errseq logic doesn't guarantee that the failure will ever be seen by an application diligently fsync()ing.

You'd not even need to have per inode information or such in the case that the block device goes away entirely. As the FS isn't generally unmounted in that case, you could trivially keep a per-mount (or superblock?) bit that says "I died" and set that instead of keeping per inode/whatever information.

2018-04-10 18:43:56 Ted wrote:

> If you are aware of a company who is willing to pay to have a new kernel feature implemented to meet your needs, we might be able to refer you to a company or a consultant who might be able to do that work.

I find it a bit of a disappointing response. I think it's fair to say that for advanced features, but we're talking about the basic guarantee that fsync actually does something even remotely reasonable.

2018-04-10 19:44:48 Andreas wrote:

> The confusion is whether fsync() is a "level" state (return error forever if there were pages that could not be written), or an "edge" state (return error only for any write failures since the previous fsync() call).

I don't think that's the full issue. We can deal with the fact that an fsync failure is edge-triggered if there's a guarantee that every process doing so would get it. The fact that one needs to have an FD open from before any failing writes occurred to get a failure, THAT'S the big issue.

Beyond postgres, it's a pretty common approach to do work on a lot of files without fsyncing, then iterate over the directory, fsync everything, and then assume you're safe.
But unless I severely misunderstand something that'd only be safe if you kept an FD for every file open, which isn't realistic for pretty obvious reasons.

2018-04-10 19:44:48 Andreas wrote:

> I think Anthony Iliopoulos was pretty clear in his multiple descriptions in that thread of why the current behaviour is needed (OOM of the whole system if dirty pages are kept around forever), but many others were stuck on "I can't believe this is happening??? This is totally unacceptable and every kernel needs to change to match my expectations!!!" without looking at the larger picture of what is practical to change and where the issue should best be fixed.

Everyone can participate in discussions...

From: Andreas Dilger <adilger@...ger.ca>
Date: Wed, 11 Apr 2018 15:52:44 -0600

On Apr 10, 2018, at 4:07 PM, Andres Freund andres@...razel.de wrote:

> 2018-04-10 18:43:56 Ted wrote:
>> So for better or for worse, there has not been as much investment in buffered I/O and data robustness in the face of exception handling of storage devices.
>
> That's a bit of a cop out. It's not just databases that care. Even more basic tools like SCM, package managers and editors care whether they can get proper responses back from fsync that imply things actually were synced.

Sure, but it is mostly PG that is doing (IMHO) crazy things like writing to thousands(?) of files, closing the file descriptors, then expecting fsync() on a newly-opened fd to return a historical error. If an editor tries to write a file, then calls fsync and gets an error, the user will enter a new pathname and retry the write. The package manager will assume the package installation failed, and uninstall the parts of the package that were already written.

There is no way the filesystem can handle the package manager failure case, and keeping the pages dirty and retrying indefinitely may never work (e.g. disk is dead or disconnected, is a sparse volume without any free space, etc). This (IMHO) implies that the higher layer (which knows more about what the write failure implies) needs to deal with this.

> 2018-04-10 18:43:56 Ted wrote:
>> So this is the explanation for why Linux handles I/O errors by clearing the dirty bit after reporting the error up to user space. And why there is no eagerness to solve the problem simply by "don't clear the dirty bit". For every one Postgres installation that might have a better recovery after an I/O error, there's probably a thousand clueless Fedora and Ubuntu users who will have a much worse user experience after a USB stick pull happens.
>
> I don't think these necessarily are as contradictory goals as you paint them. At least in postgres' case we can deal with the fact that an fsync retry isn't going to fix the problem by reentering crash recovery or just shutting down - therefore we don't need to keep all the dirty buffers around. A per-inode or per-superblock bit that causes further fsyncs to fail would be entirely sufficient for that.
>
> While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.

I think there are two issues here - "fsync() on an fd that was just opened" and "persistent error state (without keeping dirty pages in memory)".

If there is background data writeback without an open file descriptor, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.

Consider if there was a per-inode "there was once an error writing this inode" flag. Then fsync() would return an error on the inode forever, since there is no way in POSIX to clear this state, since it would need to be kept in case some new fd is opened on the inode and does an fsync() and wants the error to be returned.

IMHO, the only alternative would be to keep the dirty pages in memory until they are written to disk. If that was not possible, what then? It would need a reboot to clear the dirty pages, or truncate the file (discarding all data)?

> Both in postgres, and a lot of other applications, it's not at all guaranteed to consistently have one FD open for every file written. Therefore even the more recent per-fd errseq logic doesn't guarantee that the failure will ever be seen by an application diligently fsync()ing.

... only if the application closes all fds for the file before calling fsync. If any fd is kept open from the time of the failure, it will return the original error on fsync() (and then no longer return it).

It's not that you need to keep every fd open forever. You could put them into a shared pool, and re-use them if the file is "re-opened", and call fsync on each fd before it is closed (because the pool is getting too big or because you want to flush the data for that file, or shut down the DB). That wouldn't require a huge re-architecture of PG, just a small library to handle the shared fd pool.

That might even improve performance, because opening and closing files is itself not free, especially if you are working with remote filesystems.

> You'd not even need to have per inode information or such in the case that the block device goes away entirely. As the FS isn't generally unmounted in that case, you could trivially keep a per-mount (or superblock?) bit that says "I died" and set that instead of keeping per inode/whatever information.

The filesystem will definitely return an error in this case, I don't think this needs any kind of changes:

    int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
    {
            if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
                    return -EIO;

> 2018-04-10 18:43:56 Ted wrote:
>> If you are aware of a company who is willing to pay to have a new kernel feature implemented to meet your needs, we might be able to refer you to a company or a consultant who might be able to do that work.
>
> I find it a bit of a disappointing response. I think it's fair to say that for advanced features, but we're talking about the basic guarantee that fsync actually does something even remotely reasonable.

Linux (as PG) is run by people who develop it for their own needs, or are paid to develop it for the needs of others. Everyone already has too much work to do, so you need to find someone who has an interest in fixing this (IMHO very peculiar) use case. If PG developers want to add a tunable "keep dirty pages in RAM on IO failure", I don't think that it would be too hard for someone to do. It might be harder to convince some of the kernel maintainers to accept it, and I've been on the losing side of that battle more than once. However, like everything you don't pay for, you can't require someone else to do this for you. It wouldn't hurt to see if Jeff Layton, who wrote the errseq patches, would be interested to work on something like this.

That said, even if a fix was available for Linux tomorrow, it would be years before a majority of users would have it available on their system, that includes even the errseq mechanism that was landed a few months ago. That implies to me that you'd want something that fixes PG now so that it works around whatever (perceived) breakage exists in the Linux fsync() implementation.
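
One possible shape for the fd pool suggested earlier in this message is sketched below in C. Everything in it is an assumption made for illustration: the names pool_open, pool_evict and pool_flush_all, the fixed four-slot table, and the round-robin eviction are hypothetical, and the pool lives in a single process, whereas a multi-process server such as PG would need considerably more than this.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define POOL_SLOTS    4     /* deliberately tiny, for illustration only */
#define POOL_PATH_MAX 256

struct pool_entry {
    char path[POOL_PATH_MAX];
    int  fd;
    int  used;              /* 0 when the slot is free */
};

static struct pool_entry pool[POOL_SLOTS];
static int pool_next_victim;

/* fsync() then close() one slot; returns -1 if either call failed. */
static int pool_evict(struct pool_entry *e)
{
    int rc = 0;

    if (!e->used)
        return 0;
    if (fsync(e->fd) != 0)  /* collect any pending error before the fd goes away */
        rc = -1;
    if (close(e->fd) != 0)
        rc = -1;
    e->used = 0;
    return rc;
}

/*
 * Return a descriptor for path, reusing the pooled one when the file is
 * "re-opened". Returns -1 on failure, including a failed fsync() of the
 * file that had to be evicted to make room.
 */
static int pool_open(const char *path)
{
    struct pool_entry *slot = NULL;

    if (strlen(path) >= POOL_PATH_MAX)
        return -1;

    for (int i = 0; i < POOL_SLOTS; i++) {
        if (pool[i].used && strcmp(pool[i].path, path) == 0)
            return pool[i].fd;
        if (!pool[i].used && slot == NULL)
            slot = &pool[i];
    }

    if (slot == NULL) {     /* pool is full: flush and close one entry */
        slot = &pool[pool_next_victim];
        pool_next_victim = (pool_next_victim + 1) % POOL_SLOTS;
        if (pool_evict(slot) != 0)
            return -1;
    }

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    strcpy(slot->path, path);
    slot->fd = fd;
    slot->used = 1;
    return fd;
}

/* fsync() and close() every pooled descriptor, e.g. at shutdown. */
static int pool_flush_all(void)
{
    int rc = 0;

    for (int i = 0; i < POOL_SLOTS; i++)
        if (pool_evict(&pool[i]) != 0)
            rc = -1;
    return rc;
}
```

The point of the design is that no descriptor is closed without an fsync() on that same descriptor first, so a write error recorded while it was open has a chance to be reported to the process that did the writing.
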
Since the thread indicates thatnon-Linux kernels have the same fsync() behaviour, it makes sense to dothat even if the Linux fix was available.2018-04-10 19:44:48 Andreas wrote:The confusion is whether fsync() is a "level" state (return errorforever if there were pages that could not be written), or an "edge"state (return error only for any write failures since the previousfsync() call).I don't think that's the full issue. We can deal with the fact that anfsync failure is edge-triggered if there's a guarantee that everyprocess doing so would get it. The fact that one needs to have an FDopen from before any failing writes occurred to get a failure, THAT'Sthe big issue.Beyond postgres, it's a pretty common approach to do work on a lot offiles without fsyncing, then iterate over the directory fsynceverything, and then assume you're safe. But unless I severalymisunderstand something that'd only be safe if you kept an FD forevery file open, which isn't realistic for pretty obvious reasons.I can't say how common or uncommon such a workload is, though PG is theonly application that I've heard of doing it, and I've been working onfilesystems for 20 years. I'm a bit surprised that anyone expectsfsync() on a newly-opened fd to have any state from write() calls thatpredate the open. I can understand fsync() returning an error for anyIO that happens within the context of that fsync(), but how far shouldit go back for reporting errors on that file? Forever? 
The onlyway to clear the error would be to reboot the system, since I'm notaware of any existing POSIX code to clear such an errorFrom: Dave Chinner <david@...morbit.com>Date: Thu, 12 Apr 2018 10:09:16 +1000On Wed, Apr 11, 2018 at 03:52:44PM -0600, Andreas Dilger wrote:> On Apr 10, 2018, at 4:07 PM, Andres Freund andres@...razel.de wrote:> > 2018-04-10 18:43:56 Ted wrote:> >> So for better or for worse, there has not been as much investment in> >> buffered I/O and data robustness in the face of exception handling of> >> storage devices.> >> > That's a bit of a cop out. It's not just databases that care. Even more> > basic tools like SCM, package managers and editors care whether they can> > proper responses back from fsync that imply things actually were synced.>> Sure, but it is mostly PG that is doing (IMHO) crazy things like writing> to thousands(?) of files, closing the file descriptors, then expecting> fsync() on a newly-opened fd to return a historical error.Yeah, this seems like a recipe for disaster, especially oncross-platform code where every OS platform behaves differently andalmost never to expectation.And speaking of "behaving differently to expectations", nobody hasmentioned that close() can also return write errors. Hence if you dowrite - close - open - fsync the the write error might get reportedon close, not fsync. IOWs, the assumption that "async writebackerrors will persist across close to open" is fundamentally broken tobegin with. It's even documented as a slient data loss vector inthe close(2) man page:$ man 2 close..... Dealing with error returns from close() A careful programmer will check the return value of close(), since it is quite possible that errors on a previous write(2) operation are reported only on the final close() that releases the open file description. Failing to check the return value when closing a file may lead to silent loss of data. 
This can especially be observed with NFS and with disk quota.Yeah, ensuring data integrity in the face of IO errors is a reallyhard problem. :/To pound the broken record: there are many good reasons why Linuxfilesystem developers have said "you should use direct IO" to the PGdevs each time we have this "the kernel doesn't do [complex thingsPG needs]" discussion.In this case, robust IO error reporting is easy with DIO. It's oneof the reasons most of the high performance database engines areeither using or moving to non-blocking AIO+DIO (RWF_NOWAIT) and useO_DSYNC/RWF_DSYNC for integrity-critical IO dispatch. This is alsobeing driven by the availability of high performance, high IOPSsolid state storage where buffering in RAM to optimise IO patternsand throughput provides no real performance benefit.Using the AIO+DIO infrastructure ensures errors are reported for thespecific write that fails at failure time (i.e. in the aiocompletion event for the specific IO), yet high IO throughput can bemaintained without the application needing it's own threadinginfrastructure to prevent blocking.This means the application doesn't have to guess where the writeerror occurred to retry/recover, have to handle async write errorson close(), have to use fsync() to gather write IO errors and theninfer where the IO failure was, or require kernels on everysupported platform to jump through hoops to try to do exactly theright thing in error conditions for everyone in all circumstances atall times....From: Andres Freund <andres@...razel.de>Date: Wed, 11 Apr 2018 19:17:52 -0700On 2018-04-11 15:52:44 -0600, Andreas Dilger wrote:On Apr 10, 2018, at 4:07 PM, Andres Freund andres@...razel.de wrote:2018-04-10 18:43:56 Ted wrote:So for better or for worse, there has not been as much investment inbuffered I/O and data robustness in the face of exception handling ofstorage devices.That's a bit of a cop out. It's not just databases that care. 
> > Even more basic tools like SCM, package managers and editors care whether they can get proper responses back from fsync that imply things actually were synced.
>
> Sure, but it is mostly PG that is doing (IMHO) crazy things like writing to thousands(?) of files, closing the file descriptors, then expecting fsync() on a newly-opened fd to return a historical error.

It's not just postgres. dpkg (underlying apt, on debian derived distros) to take an example I just randomly guessed, does too:

  /* We want to guarantee the extracted files are on the disk, so that the
   * subsequent renames to the info database do not end up with old or zero
   * length files in case of a system crash. As neither dpkg-deb nor tar do
   * explicit fsync()s, we have to do them here.
   * XXX: This could be avoided by switching to an internal tar extractor. */
  dir_sync_contents(cidir);

(a bunch of other places too)

Especially on ext3 but also on newer filesystems it's performancewise entirely infeasible to fsync() every single file individually - the performance becomes entirely atrocious if you do that.

I think there's some legitimate arguments that a database should use direct IO (more on that as a reply to David), but claiming that all sorts of random utilities need to use DIO with buffering etc is just insane.

> If an editor tries to write a file, then calls fsync and gets an error, the user will enter a new pathname and retry the write. The package manager will assume the package installation failed, and uninstall the parts of the package that were already written.

Except that they won't notice that they got a failure, at least in the dpkg case. And happily continue installing corrupted data

> There is no way the filesystem can handle the package manager failure case, and keeping the pages dirty and retrying indefinitely may never work (e.g. disk is dead or disconnected, is a sparse volume without any free space, etc).
> This (IMHO) implies that the higher layer (which knows more about what the write failure implies) needs to deal with this.

Yea, I agree that'd not be sane. As far as I understand the dpkg code (all of 10min reading it), that'd also be unnecessary. It can abort the installation, but only if it detects the error. Which isn't happening.

> > While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.
>
> I think there are two issues here - "fsync() on an fd that was just opened" and "persistent error state (without keeping dirty pages in memory)".
>
> If there is background data writeback without an open file descriptor, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.

And that's horrible. If I cp a file, and writeback fails in the background, and I then cat that file before restarting, I should be able to see that that failed. Instead of returning something bogus.

Or even more extreme, you untar/zip/git clone a directory. Then do a sync. And you don't know whether anything actually succeeded.

> Consider if there was a per-inode "there was once an error writing this inode" flag. Then fsync() would return an error on the inode forever, since there is no way in POSIX to clear this state, since it would need to be kept in case some new fd is opened on the inode and does an fsync() and wants the error to be returned.

The data in the file also is corrupt. Having to unmount or delete the file to reset the fact that it can't safely be assumed to be on disk isn't insane.

> > Both in postgres, and a lot of other applications, it's not at all guaranteed to consistently have one FD open for every file written.
> > Therefore even the more recent per-fd errseq logic doesn't guarantee that the failure will ever be seen by an application diligently fsync()ing.
>
> ... only if the application closes all fds for the file before calling fsync. If any fd is kept open from the time of the failure, it will return the original error on fsync() (and then no longer return it).
>
> It's not that you need to keep every fd open forever. You could put them into a shared pool, and re-use them if the file is "re-opened", and call fsync on each fd before it is closed (because the pool is getting too big or because you want to flush the data for that file, or shut down the DB). That wouldn't require a huge re-architecture of PG, just a small library to handle the shared fd pool.

Except that postgres uses multiple processes. And works on a lot of architectures. If we started to fsync all opened files on process exit our users would lynch us. We'd need a complicated scheme that sends processes across sockets between processes, then deduplicate them on the receiving side, somehow figuring out which is the oldest filedescriptors (handling clockdrift safely).

Note that it'd be perfectly fine that we've "thrown away" the buffer contents if we'd get notified that the fsync failed. We could just do WAL replay, and restore the contents (just as we do after crashes and/or for replication).

> That might even improve performance, because opening and closing files is itself not free, especially if you are working with remote filesystems.

There's already a per-process cache of open files.

> > You'd not even need to have per inode information or such in the case that the block device goes away entirely. As the FS isn't generally unmounted in that case, you could trivially keep a per-mount (or superblock?) bit that says "I died" and set that instead of keeping per inode/whatever information.
>
> The filesystem will definitely return an error in this case, I don't think this needs any kind of changes:
>
> int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
> {
>         if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
>                 return -EIO;

Well, I'm making that argument because several people argued that throwing away buffer contents in this case is the only way to not cause OOMs, and that that's incompatible with reporting errors. It's clearly not...

> > 2018-04-10 18:43:56 Ted wrote:
> >> If you are aware of a company who is willing to pay to have a new kernel feature implemented to meet your needs, we might be able to refer you to a company or a consultant who might be able to do that work.
> >
> > I find it a bit disappointing response. I think it's fair to say that for advanced features, but we're talking about the basic guarantee that fsync actually does something even remotely reasonable.
>
> Linux (as PG) is run by people who develop it for their own needs, or are paid to develop it for the needs of others.

Sure.

> Everyone already has too much work to do, so you need to find someone who has an interest in fixing this (IMHO very peculiar) use case. If PG developers want to add a tunable "keep dirty pages in RAM on IO failure", I don't think that it would be too hard for someone to do. It might be harder to convince some of the kernel maintainers to accept it, and I've been on the losing side of that battle more than once. However, like everything you don't pay for, you can't require someone else to do this for you.
> It wouldn't hurt to see if Jeff Layton, who wrote the errseq patches, would be interested to work on something like this.

I don't think this is that PG specific, as explained above.

From: Andres Freund <andres@...razel.de>
Date: Wed, 11 Apr 2018 19:32:21 -0700

Hi,

On 2018-04-12 10:09:16 +1000, Dave Chinner wrote:
> To pound the broken record: there are many good reasons why Linux filesystem developers have said "you should use direct IO" to the PG devs each time we have this "the kernel doesn't do [complex things PG needs]" discussion.

I personally am on board with doing that. But you also gotta recognize that an efficient DIO usage is a metric ton of work, and you need a large amount of differing logic for different platforms. It's just not realistic to do so for every platform. Postgres is developed by a small number of people, isn't VC backed etc. The amount of resources we can throw at something is fairly limited. I'm hoping to work on adding linux DIO support to pg, but I'm sure as hell not going to be able to do the same on windows (solaris, hpux, aix, ...) etc.

And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea. Or even just cp -r ing it, and then starting up a copy of the database. What you're saying is that none of that is doable in a safe way, unless you use special-case DIO using tooling for the whole operation (or at least tools that fsync carefully without ever closing a fd, which certainly isn't the case for cp et al).

> In this case, robust IO error reporting is easy with DIO. It's one of the reasons most of the high performance database engines are either using or moving to non-blocking AIO+DIO (RWF_NOWAIT) and use O_DSYNC/RWF_DSYNC for integrity-critical IO dispatch.
> This is also being driven by the availability of high performance, high IOPS solid state storage where buffering in RAM to optimise IO patterns and throughput provides no real performance benefit.
>
> Using the AIO+DIO infrastructure ensures errors are reported for the specific write that fails at failure time (i.e. in the aio completion event for the specific IO), yet high IO throughput can be maintained without the application needing its own threading infrastructure to prevent blocking.
>
> This means the application doesn't have to guess where the write error occurred to retry/recover, have to handle async write errors on close(), have to use fsync() to gather write IO errors and then infer where the IO failure was, or require kernels on every supported platform to jump through hoops to try to do exactly the right thing in error conditions for everyone in all circumstances at all times....

Most of that sounds like a good thing to do, but you got to recognize that that's a lot of linux specific code.

From: Andres Freund <andres@...razel.de>
Date: Wed, 11 Apr 2018 19:51:13 -0700

Hi,

On 2018-04-11 19:32:21 -0700, Andres Freund wrote:
> And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea. Or even just cp -r ing it, and then starting up a copy of the database.
> What you're saying is that none of that is doable in a safe way, unless you use special-case DIO using tooling for the whole operation (or at least tools that fsync carefully without ever closing a fd, which certainly isn't the case for cp et al).

And before somebody argues that that's a too small window to trigger the problem realistically: Restoring large databases happens pretty commonly (for new replicas, testcases, or actual fatal issues), takes time, and it's where a lot of storage is actually written to for the first time in a while, so it's far from unlikely to trigger bad block errors or such.

From: Matthew Wilcox <willy@...radead.org>
Date: Wed, 11 Apr 2018 20:02:48 -0700

On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:
> > > While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.
> >
> > I think there are two issues here - "fsync() on an fd that was just opened" and "persistent error state (without keeping dirty pages in memory)".
> >
> > If there is background data writeback without an open file descriptor, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.
>
> And that's horrible. If I cp a file, and writeback fails in the background, and I then cat that file before restarting, I should be able to see that that failed. Instead of returning something bogus.

At the moment, when we open a file, we sample the current state of the writeback error and only report new errors. We could set it to zero instead, and report the most recent error as soon as anything happens which would report an error.
That way err = close(open("file")); would report the most recent error.

That's not going to be persistent across the data structure for that inode being removed from memory; we'd need filesystem support for persisting that. But maybe it's "good enough" to only support it for recent files.

Jeff, what do you think?

From: "Theodore Y. Ts'o" <tytso@....edu>
Date: Thu, 12 Apr 2018 01:09:24 -0400

On Wed, Apr 11, 2018 at 07:32:21PM -0700, Andres Freund wrote:
> Most of that sounds like a good thing to do, but you got to recognize that that's a lot of linux specific code.

I know it's not what PG has chosen, but realistically all of the other major databases and userspace based storage systems have used DIO precisely because it's the way to avoid OS-specific behavior or require OS-specific code. DIO is simple, and pretty much the same everywhere.

In contrast, the exact details of how buffered I/O works can be quite different on different OS's. This is especially true if you take performance related details (e.g., the cleaning algorithm, how pages get chosen for eviction, etc.)

As I read the PG-hackers thread, I thought I saw acknowledgement that some of the behaviors you don't like with Linux also show up on other Unix or Unix-like systems?

From: "Theodore Y. Ts'o" <tytso@....edu>
Date: Thu, 12 Apr 2018 01:34:45 -0400

On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:
> > If there is background data writeback without an open file descriptor, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.
>
> And that's horrible. If I cp a file, and writeback fails in the background, and I then cat that file before restarting, I should be able to see that that failed.
> Instead of returning something bogus.

If there is no open file descriptor, and in many cases, no process (because it has already exited), it may be horrible, but what the h*ll else do you expect the OS to do?

The solution we use at Google is that we watch for I/O errors using a completely different process that is responsible for monitoring machine health. It used to scrape dmesg, but we now arrange to have I/O errors get sent via a netlink channel to the machine health monitoring daemon. If it detects errors on a particular hard drive, it tells the cluster file system to stop using that disk, and to reconstruct from erasure code all of the data chunks on that disk onto other disks in the cluster. We then run a series of disk diagnostics to make sure we find all of the bad sectors (very often, where there is one bad sector, there are several more waiting to be found), and then afterwards, put the disk back into service.

By making it be a separate health monitoring process, we can have HDD experts write much more sophisticated code that can ask the disk firmware for more information (e.g., SMART, the grown defect list), do much more careful scrubbing of the disk media, etc., before returning the disk back to service.

> > Everyone already has too much work to do, so you need to find someone who has an interest in fixing this (IMHO very peculiar) use case. If PG developers want to add a tunable "keep dirty pages in RAM on IO failure", I don't think that it would be too hard for someone to do. It might be harder to convince some of the kernel maintainers to accept it, and I've been on the losing side of that battle more than once. However, like everything you don't pay for, you can't require someone else to do this for you.
> > It wouldn't hurt to see if Jeff Layton, who wrote the errseq patches, would be interested to work on something like this.
>
> I don't think this is that PG specific, as explained above.

The reality is that recovering from disk errors is tricky business, and I very much doubt most userspace applications, including distro package managers, are going to want to engineer for trying to detect and recover from disk errors. If that were true, then Red Hat and/or SuSE have kernel engineers, and they would have implemented everything on your wish list. They haven't, and that should tell you something.

The other reality is that once a disk starts developing errors, in reality you will probably need to take the disk off-line, scrub it to find any other media errors, and there's a good chance you'll need to rewrite bad sectors (including some which are on top of file system metadata, so you probably will have to run fsck or reformat the whole file system). I certainly don't think it's realistic to assume adding lots of sophistication to each and every userspace program.

If you have tens or hundreds of thousands of disk drives, then you will need to do something automated, but I claim that you really don't want to smush all of that detailed exception handling and HDD repair technology into each database or cluster file system component. It really needs to be done in a separate health-monitor and machine-level management system.

From: Dave Chinner <david@...morbit.com>
Date: Thu, 12 Apr 2018 15:45:36 +1000

On Wed, Apr 11, 2018 at 07:32:21PM -0700, Andres Freund wrote:
> Hi,
>
> On 2018-04-12 10:09:16 +1000, Dave Chinner wrote:
> > To pound the broken record: there are many good reasons why Linux filesystem developers have said "you should use direct IO" to the PG devs each time we have this "the kernel doesn't do <complex things PG needs>" discussion.
>
> I personally am on board with doing that.
> But you also gotta recognize that an efficient DIO usage is a metric ton of work, and you need a large amount of differing logic for different platforms. It's just not realistic to do so for every platform. Postgres is developed by a small number of people, isn't VC backed etc. The amount of resources we can throw at something is fairly limited. I'm hoping to work on adding linux DIO support to pg, but I'm sure as hell not going to be able to do the same on windows (solaris, hpux, aix, ...) etc.
>
> And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea.

Yes it is.

This is what syncfs() is for - making sure a large amount of data and metadata spread across many files and subdirectories in a single filesystem is pushed to stable storage in the most efficient manner possible.

> Or even just cp -r ing it, and then starting up a copy of the database. What you're saying is that none of that is doable in a safe way, unless you use special-case DIO using tooling for the whole operation (or at least tools that fsync carefully without ever closing a fd, which certainly isn't the case for cp et al).

No, Just saying fsyncing individual files and directories is about the most inefficient way you could possibly go about doing this.

From: Lukas Czerner <lczerner@...hat.com>
Date: Thu, 12 Apr 2018 12:19:26 +0200

On Wed, Apr 11, 2018 at 07:32:21PM -0700, Andres Freund wrote:
> And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea. Or even just cp -r ing it, and then starting up a copy of the database.
> What you're saying is that none of that is doable in a safe way, unless you use special-case DIO using tooling for the whole operation (or at least tools that fsync carefully without ever closing a fd, which certainly isn't the case for cp et al).

Does not seem like a problem to me, just checksum the thing if you really need to be extra safe. You should probably be doing it anyway if you backup / archive / timetravel / whatnot.

From: Jeff Layton <jlayton@...hat.com>
Date: Thu, 12 Apr 2018 07:09:14 -0400

On Wed, 2018-04-11 at 20:02 -0700, Matthew Wilcox wrote:
> On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:
> > > > While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.
> > >
> > > I think there are two issues here - "fsync() on an fd that was just opened" and "persistent error state (without keeping dirty pages in memory)".
> > >
> > > If there is background data writeback without an open file descriptor, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.
> >
> > And that's horrible. If I cp a file, and writeback fails in the background, and I then cat that file before restarting, I should be able to see that that failed. Instead of returning something bogus.

What are you expecting to happen in this case? Are you expecting a read error due to a writeback failure? Or are you just saying that we should be invalidating pages that failed to be written back, so that they can be re-read?

> At the moment, when we open a file, we sample the current state of the writeback error and only report new errors. We could set it to zero instead, and report the most recent error as soon as anything happens which would report an error.
> That way err = close(open("file")); would report the most recent error.
>
> That's not going to be persistent across the data structure for that inode being removed from memory; we'd need filesystem support for persisting that. But maybe it's "good enough" to only support it for recent files.
>
> Jeff, what do you think?

I hate it :). We could do that, but....yecchhhh.

Reporting errors only in the case where the inode happened to stick around in the cache seems too unreliable for real-world usage, and might be problematic for some use cases. I'm also not sure it would really be helpful.

I think the crux of the matter here is not really about error reporting, per-se. I asked this at LSF last year, and got no real answer:

When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?

One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.

Maybe that's ok in the face of a writeback error though? IDK.

From: Matthew Wilcox <willy@...radead.org>
Date: Thu, 12 Apr 2018 04:19:48 -0700

On Thu, Apr 12, 2018 at 07:09:14AM -0400, Jeff Layton wrote:
> On Wed, 2018-04-11 at 20:02 -0700, Matthew Wilcox wrote:
> > At the moment, when we open a file, we sample the current state of the writeback error and only report new errors. We could set it to zero instead, and report the most recent error as soon as anything happens which would report an error. That way err = close(open("file")); would report the most recent error.
> >
> > That's not going to be persistent across the data structure for that inode being removed from memory; we'd need filesystem support for persisting that. But maybe it's "good enough" to only support it for recent files.
> >
> > Jeff, what do you think?
>
> I hate it :).
> We could do that, but....yecchhhh.
>
> Reporting errors only in the case where the inode happened to stick around in the cache seems too unreliable for real-world usage, and might be problematic for some use cases. I'm also not sure it would really be helpful.

Yeah, it's definitely half-arsed. We could make further changes to improve the situation, but they'd have wider impact. For example, we can tell if the error has been sampled by any existing fd, so we could bias our inode reaping to have inodes with unreported errors stick around in the cache for longer.

> I think the crux of the matter here is not really about error reporting, per-se. I asked this at LSF last year, and got no real answer:
>
> When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?

I suspect it isn't. If there's a transient error then we should reattempt the write. OTOH if the error is permanent then reattempting the write isn't going to do any good and it's just going to cause the drive to go through the whole error handling dance again. And what do we do if we're low on memory and need these pages back to avoid going OOM? There's a lot of options here, all of them bad in one situation or another.

> One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.
>
> Maybe that's ok in the face of a writeback error though? IDK.

I don't know either. It'd force the application to face up to the fact that the data is gone immediately rather than only finding it out after a reboot.
Again though that might cause more problems than it solves. It's hard to know what the right thing to do is.

From: Jeff Layton <jlayton@...hat.com>
Date: Thu, 12 Apr 2018 07:24:12 -0400

On Thu, 2018-04-12 at 15:45 +1000, Dave Chinner wrote:
> On Wed, Apr 11, 2018 at 07:32:21PM -0700, Andres Freund wrote:
> > Hi,
> >
> > On 2018-04-12 10:09:16 +1000, Dave Chinner wrote:
> > > To pound the broken record: there are many good reasons why Linux filesystem developers have said "you should use direct IO" to the PG devs each time we have this "the kernel doesn't do <complex things PG needs>" discussion.
> >
> > I personally am on board with doing that. But you also gotta recognize that an efficient DIO usage is a metric ton of work, and you need a large amount of differing logic for different platforms. It's just not realistic to do so for every platform. Postgres is developed by a small number of people, isn't VC backed etc. The amount of resources we can throw at something is fairly limited. I'm hoping to work on adding linux DIO support to pg, but I'm sure as hell not going to be able to do the same on windows (solaris, hpux, aix, ...) etc.
> >
> > And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea.
>
> Yes it is.
>
> This is what syncfs() is for - making sure a large amount of data and metadata spread across many files and subdirectories in a single filesystem is pushed to stable storage in the most efficient manner possible.

Just note that the error return from syncfs is somewhat iffy. It doesn't necessarily return an error when one inode fails to be written back. I think it mainly returns errors when you get a metadata writeback error.

> > Or even just cp -r ing it, and then starting up a copy of the database.
> > What you're saying is that none of that is doable in a safe way, unless you use special-case DIO using tooling for the whole operation (or at least tools that fsync carefully without ever closing a fd, which certainly isn't the case for cp et al).
>
> No, Just saying fsyncing individual files and directories is about the most inefficient way you could possibly go about doing this.

You can still use syncfs but what you'd probably have to do is call syncfs while you still hold all of the fd's open, and then fsync each one afterward to ensure that they all got written back properly. That should work as you'd expect.

From: Dave Chinner <david@...morbit.com>
Date: Thu, 12 Apr 2018 22:01:22 +1000

On Thu, Apr 12, 2018 at 07:09:14AM -0400, Jeff Layton wrote:
> When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?

There isn't a right thing. Whatever we do will be wrong for someone.

> One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.

Not to mention a POSIX IO ordering violation. Seeing stale data after a "successful" write is simply not allowed.

> Maybe that's ok in the face of a writeback error though? IDK.

No matter what we do for async writeback error handling, it will be slightly different from filesystem to filesystem, not to mention OS to OS. There is no magic bullet here, so I'm not sure we should worry too much. There's direct IO for anyone who cares that need to know about the completion status of every single write IO....

From: "Theodore Y. Ts'o" <tytso@....edu>
Date: Thu, 12 Apr 2018 11:16:46 -0400

On Thu, Apr 12, 2018 at 10:01:22PM +1000, Dave Chinner wrote:
> On Thu, Apr 12, 2018 at 07:09:14AM -0400, Jeff Layton wrote:
> > When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?
>
> There isn't a right thing. Whatever we do will be wrong for someone.

That's the problem. The best that could be done (and it's not enough) would be to have a mode which does what the PG folks want (or what they think they want). It seems what they want is to not have an error result in the page being marked clean. When they discover the outcome (OOM-city and the inability to unmount a file system on a failed drive), then they will complain to us again, at which point we can tell them that what they really want is another variation on O_PONIES, and welcome to the real world and real life.

Which is why, even if they were to pay someone to implement what they want, I'm not sure we would want to accept it upstream --- or distros might consider it a support nightmare, and refuse to allow that mode to be enabled on enterprise distros. But at least, it will have been some PG-based company who will have implemented it, so they're not wasting other people's time or other people's resources...

We could try to get something like what Google is doing upstream, which is to have the I/O errors sent to userspace via a netlink channel (without changing anything else about how buffered writeback is handled in the face of errors). Then userspace applications could switch to Direct I/O like all of the other really serious userspace storage solutions I'm aware of, and then someone could try to write some kind of HDD health monitoring system that tries to do the right thing when a disk is discovered to have developed some media errors or something more serious (e.g., a head failure).
That plus some kind ofRAID solution is I think the only thing which is really realistic fora typical PG site.It's certainly that's what I would do if I didn't decide to use ahosted cloud solution, such as Cloud SQL for Postgres, and let someoneelse solve the really hard problems of dealing with real-world HDDfailures. :-)From: Jeff Layton <jlayton@...hat.com>Date: Thu, 12 Apr 2018 11:08:50 -0400On Thu, 2018-04-12 at 22:01 +1000, Dave Chinner wrote:On Thu, Apr 12, 2018 at 07:09:14AM -0400, Jeff Layton wrote:When there is a writeback error, what should be done with the dirtypage(s)? Right now, we usually just mark them clean and carry on. Isthat the right thing to do?There isn't a right thing. Whatever we do will be wrong for someone.One possibility would be to invalidate the range that failed to bewritten (or the whole file) and force the pages to be faulted in againon the next access. It could be surprising for some applications to notsee the results of their writes on a subsequent read after such anevent.Not to mention a POSIX IO ordering violation. Seeing stale dataafter a "successful" write is simply not allowed.I'm not so sure here, given that we're dealing with an error condition.Are we really obligated not to allow any changes to pages that we can'twrite back?Given that the pages are clean after these failures, we aren't doingthis even today:Suppose we're unable to do writes but can do reads vs. the backingstore. After a wb failure, the page has the dirty bit cleared. If itgets kicked out of the cache before the read occurs, it'll have to befaulted back in. Poof -- your write just disappeared.That can even happen before you get the chance to call fsync, so even awrite()+read()+fsync() is not guaranteed to be safe in this regardtoday, given sufficient memory pressure.I think the current situation is fine from a "let's not OOM at allcosts" standpoint, but not so good for application predictability. 
We should really consider ways to do better here.

Maybe that's ok in the face of a writeback error though? IDK.

No matter what we do for async writeback error handling, it will be slightly different from filesystem to filesystem, not to mention OS to OS. There is no magic bullet here, so I'm not sure we should worry too much. There's direct IO for anyone who cares that need to know about the completion status of every single write IO....

I think we have an opportunity here to come up with better defined and hopefully more useful behavior for buffered I/O in the face of writeback errors. The first step would be to hash out what we'd want it to look like.

Maybe we need a plenary session at LSF/MM?

From: Andres Freund <andres@...razel.de>
Date: Thu, 12 Apr 2018 12:46:27 -0700

Hi,

On 2018-04-12 12:19:26 +0200, Lukas Czerner wrote:
On Wed, Apr 11, 2018 at 07:32:21PM -0700, Andres Freund wrote:
And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea. Or even just cp -r ing it, and then starting up a copy of the database. What you're saying is that none of that is doable in a safe way, unless you use special-case DIO using tooling for the whole operation (or at least tools that fsync carefully without ever closing a fd, which certainly isn't the case for cp et al).

Does not seem like a problem to me, just checksum the thing if you really need to be extra safe. You should probably be doing it anyway if you backup / archive / timetravel / whatnot.

That doesn't really help, unless you want to sync() and then re-read all the data to make sure it's the same. Rereading multi-TB backups just to know whether there was an error that the OS knew about isn't particularly fun.
Without verifying after sync it's not going to improve the situation measurably, you're still only going to discover that $data isn't available when it's needed.

What you're saying here is that there's no way to use standard linux tools to manipulate files and know whether it failed, without filtering kernel logs for IO errors. Or am I missing something?

From: Andres Freund <andres@...razel.de>
Date: Thu, 12 Apr 2018 12:55:36 -0700

Hi,

On 2018-04-12 01:34:45 -0400, Theodore Y. Ts'o wrote:
The solution we use at Google is that we watch for I/O errors using a completely different process that is responsible for monitoring machine health. It used to scrape dmesg, but we now arrange to have I/O errors get sent via a netlink channel to the machine health monitoring daemon.

Any pointers to the underlying netlink mechanism? If we can force postgres to kill itself when such an error is detected (via a dedicated monitoring process), I'd personally be happy enough. It'd be nicer if we could associate that knowledge with particular filesystems etc (which'd possibly be hard through dm etc?), but this'd be much better than nothing.

The reality is that recovering from disk errors is tricky business, and I very much doubt most userspace applications, including distro package managers, are going to want to engineer for trying to detect and recover from disk errors. If that were true, then Red Hat and/or SuSE have kernel engineers, and they would have implemented everything on your wish list. They haven't, and that should tell you something.

The problem really isn't about recovering from disk errors. Knowing about them is the crucial part. We do not want to give back clients the information that an operation succeeded, when it actually didn't. There could be improvements above that, but as long as it's guaranteed that "we" get the error (rather than just some kernel log we don't have access to, which looks different due to config etc), it's ok.
We can throw our hands up in the air and give up.

The other reality is that once a disk starts developing errors, in reality you will probably need to take the disk off-line, scrub it to find any other media errors, and there's a good chance you'll need to rewrite bad sectors (including some which are on top of file system metadata, so you probably will have to run fsck or reformat the whole file system). I certainly don't think it's realistic to assume adding lots of sophistication to each and every userspace program.

If you have tens or hundreds of thousands of disk drives, then you will need to do something automated, but I claim that you really don't want to smush all of that detailed exception handling and HDD repair technology into each database or cluster file system component. It really needs to be done in a separate health-monitor and machine-level management system.

Yea, agreed on all that. I don't think anybody actually involved in postgres wants to do anything like that. Seems far outside of postgres' remit.

From: Andres Freund <andres@...razel.de>
Date: Thu, 12 Apr 2018 13:13:22 -0700

Hi,

On 2018-04-12 11:16:46 -0400, Theodore Y. Ts'o wrote:
That's the problem. The best that could be done (and it's not enough) would be to have a mode which does what the PG folks want (or what they think they want). It seems what they want is to have an error result in the page being marked clean. When they discover the outcome (OOM-city and the inability to unmount a file system on a failed drive), then they will complain to us again, at which point we can tell them that what they really want is another variation on O_PONIES, and welcome to the real world and real life.

I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient. I don't see that that'd realistically trigger OOM or the inability to unmount a filesystem.
If the drive is entirely gone there's obviously no point in keeping per-file information around, so per-blockdev/fs information suffices entirely to return an error on fsync (which at least on ext4 appears to happen if the underlying blockdev is gone).

Have fun making up things we want, but I'm not sure it's particularly productive.

Which is why, even if they were to pay someone to implement what they want, I'm not sure we would want to accept it upstream --- or distro's might consider it a support nightmare, and refuse to allow that mode to be enabled on enterprise distro's. But at least, it will have been some PG-based company who will have implemented it, so they're not wasting other people's time or other people's resources...

Well, that's why I'm discussing here so we can figure out what's acceptable before considering wasting money and review cycles doing or paying somebody to do some crazy useless shit.

We could try to get something like what Google is doing upstream, which is to have the I/O errors sent to userspace via a netlink channel (without changing anything else about how buffered writeback is handled in the face of errors).

Ah, darn. After you'd mentioned that in an earlier mail I'd hoped that'd be upstream. And yes, that'd be perfect.

Then userspace applications could switch to Direct I/O like all of the other really serious userspace storage solutions I'm aware of, and then someone could try to write some kind of HDD health monitoring system that tries to do the right thing when a disk is discovered to have developed some media errors or something more serious (e.g., a head failure). That plus some kind of RAID solution is I think the only thing which is really realistic for a typical PG site.

As I said earlier, I think there's good reason to move to DIO for postgres. But to keep that performant is going to need some serious work.

But afaict such a solution wouldn't really depend on applications using DIO or not.
Before finishing a checkpoint (logging it persistently and allowing to throw older data away), we could check if any errors have been reported and give up if there have been any. And after starting postgres on a directory restored from backup using $tool, we can fsync the directory recursively, check for such errors, and give up if there've been any.

From: Andres Freund <andres@...razel.de>
Date: Thu, 12 Apr 2018 13:24:57 -0700

On 2018-04-12 07:09:14 -0400, Jeff Layton wrote:
On Wed, 2018-04-11 at 20:02 -0700, Matthew Wilcox wrote:
On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:
While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.

I think there are two issues here - "fsync() on an fd that was just opened" and "persistent error state (without keeping dirty pages in memory)".

If there is background data writeback without an open file descriptor, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.

And that's horrible. If I cp a file, and writeback fails in the background, and I then cat that file before restarting, I should be able to see that that failed. Instead of returning something bogus.

What are you expecting to happen in this case? Are you expecting a read error due to a writeback failure? Or are you just saying that we should be invalidating pages that failed to be written back, so that they can be re-read?

Yes, I'd hope for a read error after a writeback failure. I think that's sane behaviour. But I don't really care that much.

At the very least some way to know that such a failure occurred from userland without having to parse the kernel log.
As far as I understand, neither sync(2) (and thus sync(1)) nor syncfs(2) is guaranteed to report an error if it was encountered by writeback in the background.

If that's indeed true for syncfs(2), even if the fd has been opened before (which I can see how it could happen from an implementation POV, nothing would associate a random FD with failures on different files), it's really impossible to detect this stuff from userland without text parsing.

Even if it were just a per-fs /sys/$something file that'd return the current count of unreported errors in a filesystem independent way, it'd be better than what we have right now.

1) figure out /sys/$whatnot $directory belongs to
2) oldcount=$(cat /sys/$whatnot/unreported_errors)
3) filesystem operations in $directory
4) sync;sync;
5) newcount=$(cat /sys/$whatnot/unreported_errors)
6) test "$oldcount" -eq "$newcount" || die-with-horrible-message

Isn't beautiful to script, but it's also not absolutely terrible.

From: Matthew Wilcox <willy@...radead.org>
Date: Thu, 12 Apr 2018 13:28:30 -0700

On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:
I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.

Ah; this was my suggestion to Jeff on IRC. That we add a per-superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.

I don't see that that'd realistically trigger OOM or the inability to unmount a filesystem.

Ted's referring to the current state of affairs where the writeback error is held in the inode; if we can't evict the inode because it's holding the error indicator, that can send us OOM.
If instead we transfer the error indicator to the superblock, then there's no problem.

From: Andres Freund <andres@...razel.de>
Date: Thu, 12 Apr 2018 14:11:45 -0700

On 2018-04-12 07:24:12 -0400, Jeff Layton wrote:
On Thu, 2018-04-12 at 15:45 +1000, Dave Chinner wrote:
On Wed, Apr 11, 2018 at 07:32:21PM -0700, Andres Freund wrote:
Hi,

On 2018-04-12 10:09:16 +1000, Dave Chinner wrote:
> To pound the broken record: there are many good reasons why Linux
> filesystem developers have said "you should use direct IO" to the PG
> devs each time we have this "the kernel doesn't do <complex things
> PG needs>" discussion.

I personally am on board with doing that. But you also gotta recognize that an efficient DIO usage is a metric ton of work, and you need a large amount of differing logic for different platforms. It's just not realistic to do so for every platform. Postgres is developed by a small number of people, isn't VC backed etc. The amount of resources we can throw at something is fairly limited. I'm hoping to work on adding linux DIO support to pg, but I'm sure as hell not going to be able to do the same on windows (solaris, hpux, aix, ...) etc.

And there's cases where that just doesn't help at all. Being able to untar a database from backup / archive / timetravel / whatnot, and then fsyncing the directory tree to make sure it's actually safe, is really not an insane idea.

Yes it is.

This is what syncfs() is for - making sure a large amount of data and metadata spread across many files and subdirectories in a single filesystem is pushed to stable storage in the most efficient manner possible.

syncfs isn't standardized, it operates on an entire filesystem (thus writing out unnecessary stuff), it has no meaningful documentation of its return codes. Yes, using syncfs() might be better performancewise, but it doesn't seem like it actually solves anything, performance aside:

Just note that the error return from syncfs is somewhat iffy.
It doesn't necessarily return an error when one inode fails to be written back. I think it mainly returns errors when you get a metadata writeback error.

You can still use syncfs but what you'd probably have to do is call syncfs while you still hold all of the fd's open, and then fsync each one afterward to ensure that they all got written back properly. That should work as you'd expect.

Which again doesn't allow one to use any non-bespoke tooling (like tar or whatnot). And it means you'll have to call syncfs() every few hundred files, because you'll obviously run into filehandle limitations.

From: Jeff Layton <jlayton@...hat.com>
Date: Thu, 12 Apr 2018 17:14:54 -0400

On Thu, 2018-04-12 at 13:28 -0700, Matthew Wilcox wrote:
On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:
I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.

Ah; this was my suggestion to Jeff on IRC. That we add a per-superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.

Not a bad idea and shouldn't be too costly. mapping_set_error could flag the superblock one before or after the one in the mapping.

We'd need to define what happens if you interleave fsync and syncfs calls on the same inode though. How do we handle file->f_wb_err in that case? Would we need a second field in struct file to act as the per-sb error cursor?

I don't see that that'd realistically trigger OOM or the inability to unmount a filesystem.

Ted's referring to the current state of affairs where the writeback error is held in the inode; if we can't evict the inode because it's holding the error indicator, that can send us OOM. If instead we transfer the error indicator to the superblock, then there's no problem.

From: "Theodore Y. Ts'o" <tytso@....edu>
Date: Thu, 12 Apr 2018 17:21:44 -0400

On Thu, Apr 12, 2018 at 01:28:30PM -0700, Matthew Wilcox wrote:
On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:
I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.

Ah; this was my suggestion to Jeff on IRC. That we add a per-superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.

When or how would the per-superblock wb_err flag get cleared?

Would all subsequent fsync() calls on that file system now return EIO? Or would only all subsequent syncfs() calls return EIO?

I don't see that that'd realistically trigger OOM or the inability to unmount a filesystem.

Ted's referring to the current state of affairs where the writeback error is held in the inode; if we can't evict the inode because it's holding the error indicator, that can send us OOM. If instead we transfer the error indicator to the superblock, then there's no problem.

Actually, I was referring to the pg-hackers original ask, which was that after an error, all of the dirty pages that couldn't be written out would stay dirty.

If it's only a single inode which is pinned in memory with the dirty flag, that's bad, but it's not as bad as pinning all of the memory pages for which there was a failed write. We would still need to invent some mechanism or define some semantic when it would be OK to clear the per-inode flag and let the memory associated with that pinned inode get released, though.

From: Matthew Wilcox <willy@...radead.org>
Date: Thu, 12 Apr 2018 14:24:32 -0700

On Thu, Apr 12, 2018 at 05:21:44PM -0400, Theodore Y. Ts'o wrote:
On Thu, Apr 12, 2018 at 01:28:30PM -0700, Matthew Wilcox wrote:
On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:
I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.

Ah; this was my suggestion to Jeff on IRC. That we add a per-superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.

When or how would the per-superblock wb_err flag get cleared?

That's not how errseq works, Ted ;-)

Would all subsequent fsync() calls on that file system now return EIO? Or would only all subsequent syncfs() calls return EIO?

Only ones which occur after the last sampling get reported through this particular file descriptor.

From: Jeff Layton <jlayton@...hat.com>
Date: Thu, 12 Apr 2018 17:27:54 -0400

On Thu, 2018-04-12 at 13:24 -0700, Andres Freund wrote:
On 2018-04-12 07:09:14 -0400, Jeff Layton wrote:
On Wed, 2018-04-11 at 20:02 -0700, Matthew Wilcox wrote:
On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:
While there's some differing opinions on the referenced postgres thread, the fundamental problem isn't so much that a retry won't fix the problem, it's that we might NEVER see the failure. If writeback happens in the background, encounters an error, undirties the buffer, we will happily carry on because we've never seen that. That's when we're majorly screwed.

I think there are two issues here - "fsync() on an fd that was just opened" and "persistent error state (without keeping dirty pages in memory)".

If there is background data writeback without an open file descriptor, there is no mechanism for the kernel to return an error to any application which may exist, or may not ever come back.

And that's horrible. If I cp a file, and writeback fails in the background, and I then cat that file before restarting, I should be able to see that that failed.
Instead of returning something bogus.

What are you expecting to happen in this case? Are you expecting a read error due to a writeback failure? Or are you just saying that we should be invalidating pages that failed to be written back, so that they can be re-read?

Yes, I'd hope for a read error after a writeback failure. I think that's sane behaviour. But I don't really care that much.

I'll have to respectfully disagree. Why should I interpret an error on a read() syscall to mean that writeback failed? Note that the data is still potentially intact.

What might make sense, IMO, is to just invalidate the pages that failed to be written back. Then you could potentially do a read to fault them in again (i.e. sync the pagecache and the backing store) and possibly redirty them for another try.

Note that you can detect this situation by checking the return code from fsync. It should report the latest error once per file description.

At the very least some way to know that such a failure occurred from userland without having to parse the kernel log. As far as I understand, neither sync(2) (and thus sync(1)) nor syncfs(2) is guaranteed to report an error if it was encountered by writeback in the background.

If that's indeed true for syncfs(2), even if the fd has been opened before (which I can see how it could happen from an implementation POV, nothing would associate a random FD with failures on different files), it's really impossible to detect this stuff from userland without text parsing.

syncfs could use some work.

I'm warming to willy's idea to add a per-sb errseq_t. I think that might be a simple way to get better semantics here. Not sure how we want to handle the reporting end yet though...

We probably also need to consider how to better track metadata writeback errors (on e.g. ext2).
We don't really do that properly quite yet either.

Even if it were just a per-fs /sys/$something file that'd return the current count of unreported errors in a filesystem independent way, it'd be better than what we have right now.

1) figure out /sys/$whatnot $directory belongs to
2) oldcount=$(cat /sys/$whatnot/unreported_errors)
3) filesystem operations in $directory
4) sync;sync;
5) newcount=$(cat /sys/$whatnot/unreported_errors)
6) test "$oldcount" -eq "$newcount" || die-with-horrible-message

Isn't beautiful to script, but it's also not absolutely terrible.

From: Matthew Wilcox <willy@...radead.org>
Date: Thu, 12 Apr 2018 14:31:10 -0700

On Thu, Apr 12, 2018 at 05:14:54PM -0400, Jeff Layton wrote:
On Thu, 2018-04-12 at 13:28 -0700, Matthew Wilcox wrote:
On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:
I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.

Ah; this was my suggestion to Jeff on IRC. That we add a per-superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.

Not a bad idea and shouldn't be too costly. mapping_set_error could flag the superblock one before or after the one in the mapping.

We'd need to define what happens if you interleave fsync and syncfs calls on the same inode though. How do we handle file->f_wb_err in that case? Would we need a second field in struct file to act as the per-sb error cursor?

Ooh. I hadn't thought that through. Bleh. I don't want to add a field to struct file for this uncommon case.

Maybe O_PATH could be used for this? It gets you a file descriptor on a particular filesystem, so syncfs() is defined, but it can't report a writeback error.
So if you open something O_PATH, you can use the file's f_wb_err for the mapping's error cursor.

From: Andres Freund <andres@...razel.de>
Date: Thu, 12 Apr 2018 14:37:56 -0700

On 2018-04-12 17:21:44 -0400, Theodore Y. Ts'o wrote:
On Thu, Apr 12, 2018 at 01:28:30PM -0700, Matthew Wilcox wrote:
On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:
I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.

Ah; this was my suggestion to Jeff on IRC. That we add a per-superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.

When or how would the per-superblock wb_err flag get cleared?

I don't think unmount + resettable via /sys would be an insane approach. Requiring explicit action to acknowledge data loss isn't a crazy concept. But I think that's something reasonable minds could disagree with.

Would all subsequent fsync() calls on that file system now return EIO? Or would only all subsequent syncfs() calls return EIO?

If it were tied to syncfs, I wonder if there's a way to have some errseq type logic. Store a per superblock (or whatever equivalent thing) errseq value of errors. For each fd calling syncfs() report the error once, but then store the current value in a separate per-fd field. And if that's considered too weird, only report the errors to fds that have been opened from before the error occurred.

I can see writing a tool 'pg_run_and_sync /directo /ries -- command' which opens an fd for each of the filesystems the directories reside on, and calls syncfs() after.
That'd allow to use backup/restore tools at least semi safely.

I don't see that that'd realistically trigger OOM or the inability to unmount a filesystem.

Ted's referring to the current state of affairs where the writeback error is held in the inode; if we can't evict the inode because it's holding the error indicator, that can send us OOM. If instead we transfer the error indicator to the superblock, then there's no problem.

Actually, I was referring to the pg-hackers original ask, which was that after an error, all of the dirty pages that couldn't be written out would stay dirty.

Well, it's an open list, everyone can argue. And initially people at first didn't know the OOM explanation, and then it takes some time to revise one's priors :). I think it's a design question that reasonable people can disagree upon (if "hot" removed devices are handled by throwing data away regardless, at least). But as it's clearly not something viable, we can move on to something that can solve the problem.

If it's only a single inode which is pinned in memory with the dirty flag, that's bad, but it's not as bad as pinning all of the memory pages for which there was a failed write. We would still need to invent some mechanism or define some semantic when it would be OK to clear the per-inode flag and let the memory associated with that pinned inode get released, though.

Yea, I agree that that's not obvious. One way would be to say that it's only automatically cleared when you unlink the file. A bit heavyhanded, but not too crazy.

From: "Theodore Y. Ts'o" <tytso@....edu>
Date: Thu, 12 Apr 2018 17:52:52 -0400

On Thu, Apr 12, 2018 at 12:55:36PM -0700, Andres Freund wrote:
Any pointers to the underlying netlink mechanism? If we can force postgres to kill itself when such an error is detected (via a dedicated monitoring process), I'd personally be happy enough.
It'd be nicer if we could associate that knowledge with particular filesystems etc (which'd possibly be hard through dm etc?), but this'd be much better than nothing.

Yeah, sorry, it never got upstreamed. It's not really all that complicated, it was just that there were some other folks who wanted to do something similar, and there was a round of bike-shedding several years ago, and nothing ever went upstream. Part of the problem was that our original scheme sent up information about file system-level corruption reports --- e.g, those stemming from calls to ext4_error() --- and lots of people had different ideas about how to get all of the possible information up in some structured format. (Think something like uerf from Digital's OSF/1.)

We did something really simple/stupid. We just sent essentially an ascii test string out the netlink socket. That's because what we were doing before was essentially scraping the output of dmesg (e.g. /dev/kmsg).

That's actually probably the simplest thing to do, and it has the advantage that it will work even on ancient enterprise kernels that PG users are likely to want to use. So you will need to implement the dmesg text scraper anyway, and that's probably good enough for most use cases.

The problem really isn't about recovering from disk errors. Knowing about them is the crucial part. We do not want to give back clients the information that an operation succeeded, when it actually didn't. There could be improvements above that, but as long as it's guaranteed that "we" get the error (rather than just some kernel log we don't have access to, which looks different due to config etc), it's ok. We can throw our hands up in the air and give up.

Right, it's a little challenging because the actual regexp's you would need to use do vary from device driver to device driver. Fortunately nearly everything is a SCSI/SATA device these days, so there isn't that much variability.

Yea, agreed on all that.
I don't think anybody actually involved in postgres wants to do anything like that. Seems far outside of postgres' remit.

Some people on the pg-hackers list were talking about wanting to retry the fsync() and hoping that would cause the write to somehow succeed. It's possible that might help, but it's not likely to be helpful in my experience.

From: Andres Freund <andres@...razel.de>
Date: Thu, 12 Apr 2018 14:53:19 -0700

On 2018-04-12 17:27:54 -0400, Jeff Layton wrote:
On Thu, 2018-04-12 at 13:24 -0700, Andres Freund wrote:
At the very least some way to know that such a failure occurred from userland without having to parse the kernel log. As far as I understand, neither sync(2) (and thus sync(1)) nor syncfs(2) is guaranteed to report an error if it was encountered by writeback in the background.

If that's indeed true for syncfs(2), even if the fd has been opened before (which I can see how it could happen from an implementation POV, nothing would associate a random FD with failures on different files), it's really impossible to detect this stuff from userland without text parsing.

syncfs could use some work.

It's really too bad that it doesn't have a flags argument.

We probably also need to consider how to better track metadata writeback errors (on e.g. ext2). We don't really do that properly quite yet either.

Even if it were just a per-fs /sys/$something file that'd return the current count of unreported errors in a filesystem independent way, it'd be better than what we have right now.

1) figure out /sys/$whatnot $directory belongs to
2) oldcount=$(cat /sys/$whatnot/unreported_errors)
3) filesystem operations in $directory
4) sync;sync;
5) newcount=$(cat /sys/$whatnot/unreported_errors)
6) test "$oldcount" -eq "$newcount" || die-with-horrible-message

Isn't beautiful to script, but it's also not absolutely terrible.

ext4 seems to have something roughly like that (/sys/fs/ext4/$dev/errors_count), and by my reading it already seems to be incremented from the necessary places.
By my reading XFS doesn't seem to have something similar.

Wouldn't be bad to standardize...

From: "Theodore Y. Ts'o" <tytso@....edu>
Date: Thu, 12 Apr 2018 17:57:56 -0400

On Thu, Apr 12, 2018 at 02:53:19PM -0700, Andres Freund wrote:
Isn't beautiful to script, but it's also not absolutely terrible.

ext4 seems to have something roughly like that (/sys/fs/ext4/$dev/errors_count), and by my reading it already seems to be incremented from the necessary places.

This is only for file system inconsistencies noticed by the kernel. We don't bump that count for data block I/O errors.

The same idea could be used on a block device level. It would be pretty simple to maintain a counter for I/O errors, and when the last error was detected on a particular device. You could even break out and track read errors and write errors separately if that would be useful.

If you don't care what block was bad, but just that some I/O error had happened, a counter is definitely the simplest approach, and less hair to implement and use than something like a netlink channel or scraping dmesg....

From: Andres Freund <andres@...razel.de>
Date: Thu, 12 Apr 2018 15:03:59 -0700

Hi,

On 2018-04-12 17:52:52 -0400, Theodore Y. Ts'o wrote:
We did something really simple/stupid. We just sent essentially an ascii test string out the netlink socket. That's because what we were doing before was essentially scraping the output of dmesg (e.g. /dev/kmsg).

That's actually probably the simplest thing to do, and it has the advantage that it will work even on ancient enterprise kernels that PG users are likely to want to use. So you will need to implement the dmesg text scraper anyway, and that's probably good enough for most use cases.

The worst part of that is, as you mention below, needing to handle a lot of different error message formats.
I guess it's reasonable enough if you control your hardware, but no such luck.

Aren't there quite realistic scenarios where one could miss kmsg style messages due to it being a ringbuffer?

Right, it's a little challenging because the actual regexp's you would need to use do vary from device driver to device driver. Fortunately nearly everything is a SCSI/SATA device these days, so there isn't that much variability.

There's also SAN / NAS type stuff - not all of that presents as a SCSI/SATA device, right?

Yea, agreed on all that. I don't think anybody actually involved in postgres wants to do anything like that. Seems far outside of postgres' remit.

Some people on the pg-hackers list were talking about wanting to retry the fsync() and hoping that would cause the write to somehow succeed. It's possible that might help, but it's not likely to be helpful in my experience.

Depends on the type of error and storage. ENOSPC, especially over NFS, has some reasonable chances of being cleared up. And for networked block storage it's also not impossible to think of scenarios where that'd work for EIO.

But I think besides hope of clearing up itself, it has the advantage that it trivially can give some feedback to the user. The user'll get back strerror(ENOSPC) with some decent SQL error code, which'll hopefully cause them to investigate (well, once monitoring detects high error rates). It's much nicer for the user to type COMMIT; get an appropriate error back etc, than if the database just commits suicide.

From: Dave Chinner <david@...morbit.com>
Date: Fri, 13 Apr 2018 08:44:04 +1000

On Thu, Apr 12, 2018 at 11:08:50AM -0400, Jeff Layton wrote:
On Thu, 2018-04-12 at 22:01 +1000, Dave Chinner wrote:
On Thu, Apr 12, 2018 at 07:09:14AM -0400, Jeff Layton wrote:
When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?

There isn't a right thing.
> > Whatever we do will be wrong for someone.
> >
> > > One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.
> >
> > Not to mention a POSIX IO ordering violation. Seeing stale data after a "successful" write is simply not allowed.
>
> I'm not so sure here, given that we're dealing with an error condition. Are we really obligated not to allow any changes to pages that we can't write back?

Posix says this about write():

  After a write() to a regular file has successfully returned:

  Any successful read() from each byte position in the file that was modified by that write shall return the data specified by the write() for that position until such byte positions are again modified.

IOWs, even if there is a later error, we told the user the write was successful, and so according to POSIX we are not allowed to wind back the data to what it was before the write() occurred.

> Given that the pages are clean after these failures, we aren't doing this even today:
>
> Suppose we're unable to do writes but can do reads vs. the backing store. After a wb failure, the page has the dirty bit cleared. If it gets kicked out of the cache before the read occurs, it'll have to be faulted back in. Poof -- your write just disappeared.

Yes - I was pointing out what the specification we supposedly conform to says about this behaviour, not that our current behaviour conforms to the spec. Indeed, have you even noticed xfs_aops_discard_page() and its surrounding context on page writeback submission errors?

To save you looking, XFS will trash the page contents completely on a filesystem level ->writepage error. It doesn't mark them "clean", doesn't attempt to redirty and rewrite them - it clears the uptodate state and may invalidate it completely. IOWs, the data written "successfully" to the cached page is now gone.
It will be re-read from disk on the next read() call, in direct violation of the above POSIX requirements.

This is my point: we've done that in XFS knowing that we violate POSIX specifications in this specific corner case - it's the lesser of many evils we have to choose between. Hence if we choose to encode that behaviour as the general writeback IO error handling algorithm, then it needs to be done with the knowledge it is a specification violation. Not to mention be documented as a POSIX violation in the various relevant man pages and that this is how all filesystems will behave on async writeback error.....

From: Jeff Layton <jlayton@...hat.com>
Date: Fri, 13 Apr 2018 08:56:38 -0400

On Thu, 2018-04-12 at 14:31 -0700, Matthew Wilcox wrote:
> On Thu, Apr 12, 2018 at 05:14:54PM -0400, Jeff Layton wrote:
> > On Thu, 2018-04-12 at 13:28 -0700, Matthew Wilcox wrote:
> > > On Thu, Apr 12, 2018 at 01:13:22PM -0700, Andres Freund wrote:
> > > > I think a per-file or even per-blockdev/fs error state that'd be returned by fsync() would be more than sufficient.
> > >
> > > Ah; this was my suggestion to Jeff on IRC. That we add a per-superblock wb_err and then allow syncfs() to return it. So you'd open an fd on a directory (for example), and call syncfs() which would return -EIO or -ENOSPC if either of those conditions had occurred since you opened the fd.
> >
> > Not a bad idea and shouldn't be too costly. mapping_set_error could flag the superblock one before or after the one in the mapping.
> >
> > We'd need to define what happens if you interleave fsync and syncfs calls on the same inode though. How do we handle file->f_wb_err in that case? Would we need a second field in struct file to act as the per-sb error cursor?
>
> Ooh. I hadn't thought that through. Bleh. I don't want to add a field to struct file for this uncommon case.
>
> Maybe O_PATH could be used for this? It gets you a file descriptor on a particular filesystem, so syncfs() is defined, but it can't report a writeback error.
> So if you open something O_PATH, you can use the file's f_wb_err for the mapping's error cursor.

That might work.

It'd be a syscall behavioral change so we'd need to document that well. It's probably innocuous though -- I doubt we have a lot of callers in the field opening files with O_PATH and calling syncfs on them.

From: Jeff Layton <jlayton@...hat.com>
Date: Fri, 13 Apr 2018 09:18:56 -0400

On Fri, 2018-04-13 at 08:44 +1000, Dave Chinner wrote:
> On Thu, Apr 12, 2018 at 11:08:50AM -0400, Jeff Layton wrote:
> > On Thu, 2018-04-12 at 22:01 +1000, Dave Chinner wrote:
> > > On Thu, Apr 12, 2018 at 07:09:14AM -0400, Jeff Layton wrote:
> > > > When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?
> > >
> > > There isn't a right thing. Whatever we do will be wrong for someone.
> > >
> > > > One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.
> > >
> > > Not to mention a POSIX IO ordering violation.
> > > Seeing stale data after a "successful" write is simply not allowed.
> >
> > I'm not so sure here, given that we're dealing with an error condition. Are we really obligated not to allow any changes to pages that we can't write back?
>
> Posix says this about write():
>
>   After a write() to a regular file has successfully returned:
>
>   Any successful read() from each byte position in the file that was modified by that write shall return the data specified by the write() for that position until such byte positions are again modified.
>
> IOWs, even if there is a later error, we told the user the write was successful, and so according to POSIX we are not allowed to wind back the data to what it was before the write() occurred.
>
> > Given that the pages are clean after these failures, we aren't doing this even today:
> >
> > Suppose we're unable to do writes but can do reads vs. the backing store. After a wb failure, the page has the dirty bit cleared. If it gets kicked out of the cache before the read occurs, it'll have to be faulted back in. Poof -- your write just disappeared.
>
> Yes - I was pointing out what the specification we supposedly conform to says about this behaviour, not that our current behaviour conforms to the spec. Indeed, have you even noticed xfs_aops_discard_page() and its surrounding context on page writeback submission errors?
>
> To save you looking, XFS will trash the page contents completely on a filesystem level ->writepage error. It doesn't mark them "clean", doesn't attempt to redirty and rewrite them - it clears the uptodate state and may invalidate it completely. IOWs, the data written "successfully" to the cached page is now gone. It will be re-read from disk on the next read() call, in direct violation of the above POSIX requirements.
>
> This is my point: we've done that in XFS knowing that we violate POSIX specifications in this specific corner case - it's the lesser of many evils we have to choose between.
> Hence if we choose to encode that behaviour as the general writeback IO error handling algorithm, then it needs to be done with the knowledge it is a specification violation. Not to mention be documented as a POSIX violation in the various relevant man pages and that this is how all filesystems will behave on async writeback error.....

Got it, thanks.

Yes, I think we ought to probably do the same thing globally. It's nice to know that xfs has already been doing this. That makes me feel better about making this behavior the gold standard for Linux filesystems.

So to summarize, at this point in the discussion, I think we want to consider doing the following:

* better reporting from syncfs (report an error when even one inode failed to be written back since last syncfs call). We'll probably implement this via a per-sb errseq_t in some fashion, though there are some implementation issues to work out.
* invalidate or clear uptodate flag on pages that experience writeback errors, across filesystems. Encourage this as standard behavior for filesystems and maybe add helpers to make it easier to do this.

Did I miss anything? Would that be enough to help the Pg usecase?

I don't see us ever being able to reasonably support its current expectation that writeback errors will be seen on fd's that were opened after the error occurred. That's a really thorny problem from an object lifetime perspective.

From: Andres Freund <andres@...razel.de>
Date: Fri, 13 Apr 2018 06:25:35 -0700

Hi,

On 2018-04-13 09:18:56 -0400, Jeff Layton wrote:
> Yes, I think we ought to probably do the same thing globally. It's nice to know that xfs has already been doing this. That makes me feel better about making this behavior the gold standard for Linux filesystems.
>
> So to summarize, at this point in the discussion, I think we want to consider doing the following:
>
> * better reporting from syncfs (report an error when even one inode failed to be written back since last syncfs call).
>   We'll probably implement this via a per-sb errseq_t in some fashion, though there are some implementation issues to work out.
> * invalidate or clear uptodate flag on pages that experience writeback errors, across filesystems. Encourage this as standard behavior for filesystems and maybe add helpers to make it easier to do this.
>
> Did I miss anything? Would that be enough to help the Pg usecase?
>
> I don't see us ever being able to reasonably support its current expectation that writeback errors will be seen on fd's that were opened after the error occurred. That's a really thorny problem from an object lifetime perspective.

It's not perfect, but I think the amount of hacky OS specific code should be acceptable. And it does allow for a wrapper tool that can be used around backup restores etc to syncfs all the necessary filesystems. Let me mull with others for a bit.

From: Matthew Wilcox <willy@...radead.org>
Date: Fri, 13 Apr 2018 07:02:32 -0700

On Fri, Apr 13, 2018 at 09:18:56AM -0400, Jeff Layton wrote:
> On Fri, 2018-04-13 at 08:44 +1000, Dave Chinner wrote:
> > To save you looking, XFS will trash the page contents completely on a filesystem level ->writepage error. It doesn't mark them "clean", doesn't attempt to redirty and rewrite them - it clears the uptodate state and may invalidate it completely. IOWs, the data written "successfully" to the cached page is now gone. It will be re-read from disk on the next read() call, in direct violation of the above POSIX requirements.
> >
> > This is my point: we've done that in XFS knowing that we violate POSIX specifications in this specific corner case - it's the lesser of many evils we have to choose between. Hence if we choose to encode that behaviour as the general writeback IO error handling algorithm, then it needs to be done with the knowledge it is a specification violation.
> > Not to mention be documented as a POSIX violation in the various relevant man pages and that this is how all filesystems will behave on async writeback error.....
>
> Got it, thanks.
>
> Yes, I think we ought to probably do the same thing globally. It's nice to know that xfs has already been doing this. That makes me feel better about making this behavior the gold standard for Linux filesystems.
>
> So to summarize, at this point in the discussion, I think we want to consider doing the following:
>
> * better reporting from syncfs (report an error when even one inode failed to be written back since last syncfs call). We'll probably implement this via a per-sb errseq_t in some fashion, though there are some implementation issues to work out.
> * invalidate or clear uptodate flag on pages that experience writeback errors, across filesystems. Encourage this as standard behavior for filesystems and maybe add helpers to make it easier to do this.
>
> Did I miss anything? Would that be enough to help the Pg usecase?
>
> I don't see us ever being able to reasonably support its current expectation that writeback errors will be seen on fd's that were opened after the error occurred. That's a really thorny problem from an object lifetime perspective.

I think we can do better than XFS is currently doing (but I agree that we should have the same behaviour across all Linux filesystems!)

* If we get an error while wbc->for_background is true, we should not clear uptodate on the page, rather SetPageError and SetPageDirty.
* Background writebacks should skip pages which are PageError.
* for_sync writebacks should attempt one last write. Maybe it'll succeed this time. If it does, just ClearPageError. If not, we have somebody to report this writeback error to, and ClearPageUptodate.

I think kupdate writes are the same as for_background writes. for_reclaim is tougher. I don't want to see us getting into OOM because we're hanging onto stale data, but we don't necessarily have an open fd to report the error on.
I think I'm leaning towards behaving the same for for_reclaim as for_sync, but this is probably a subject on which reasonable people can disagree.

And this logic all needs to be in one place, although invoked from each filesystem.

From: Matthew Wilcox <willy@...radead.org>
Date: Fri, 13 Apr 2018 07:48:07 -0700

On Tue, Apr 10, 2018 at 03:07:26PM -0700, Andres Freund wrote:
> I don't think that's the full issue. We can deal with the fact that an fsync failure is edge-triggered if there's a guarantee that every process doing so would get it. The fact that one needs to have an FD open from before any failing writes occurred to get a failure, THAT'S the big issue.
>
> Beyond postgres, it's a pretty common approach to do work on a lot of files without fsyncing, then iterate over the directory fsync everything, and then assume you're safe. But unless I severely misunderstand something that'd only be safe if you kept an FD for every file open, which isn't realistic for pretty obvious reasons.

While accepting that under memory pressure we can still evict the error indicators, we can do a better job than we do today. The current design of error reporting says that all errors which occurred before you opened the file descriptor are of no interest to you. I don't think that's necessarily true, and it's actually a change of behaviour from before the errseq work.

Consider Stupid Task A which calls open(), write(), close(), and Smart Task B which calls open(), write(), fsync(), close() operating on the same file. If A goes entirely before B and encounters an error, before errseq_t, B would see the error from A's write.

If A and B overlap, even a little bit, then B still gets to see A's error today. But if writeback happens for A's write before B opens the file then B will never see the error.

B doesn't want to see historical errors that a previous invocation of B has already handled, but we know whether anyone has seen the error or not.
So here's a patch which restores the historical behaviour of seeing old unhandled errors on a fresh file descriptor:

Signed-off-by: Matthew Wilcox <mawilcox@...rosoft.com>

diff --git a/lib/errseq.c b/lib/errseq.c
index df782418b333..093f1fba4ee0 100644
--- a/lib/errseq.c
+++ b/lib/errseq.c
@@ -119,19 +119,11 @@ EXPORT_SYMBOL(errseq_set);
 errseq_t errseq_sample(errseq_t *eseq)
 {
 	errseq_t old = READ_ONCE(*eseq);
-	errseq_t new = old;
 
-	/*
-	 * For the common case of no errors ever having been set, we can skip
-	 * marking the SEEN bit. Once an error has been set, the value will
-	 * never go back to zero.
-	 */
-	if (old != 0) {
-		new |= ERRSEQ_SEEN;
-		if (old != new)
-			cmpxchg(eseq, old, new);
-	}
-	return new;
+	/* If nobody has seen this error yet, then we can be the first. */
+	if (!(old & ERRSEQ_SEEN))
+		old = 0;
+	return old;
 }
 EXPORT_SYMBOL(errseq_sample);

From: Dave Chinner <david@...morbit.com>
Date: Sat, 14 Apr 2018 11:47:52 +1000

On Fri, Apr 13, 2018 at 07:02:32AM -0700, Matthew Wilcox wrote:
> On Fri, Apr 13, 2018 at 09:18:56AM -0400, Jeff Layton wrote:
> > On Fri, 2018-04-13 at 08:44 +1000, Dave Chinner wrote:
> > > To save you looking, XFS will trash the page contents completely on a filesystem level ->writepage error. It doesn't mark them "clean", doesn't attempt to redirty and rewrite them - it clears the uptodate state and may invalidate it completely. IOWs, the data written "successfully" to the cached page is now gone. It will be re-read from disk on the next read() call, in direct violation of the above POSIX requirements.
> > >
> > > This is my point: we've done that in XFS knowing that we violate POSIX specifications in this specific corner case - it's the lesser of many evils we have to choose between. Hence if we choose to encode that behaviour as the general writeback IO error handling algorithm, then it needs to be done with the knowledge it is a specification violation.
> > > Not to mention be documented as a POSIX violation in the various relevant man pages and that this is how all filesystems will behave on async writeback error.....
> >
> > Got it, thanks.
> >
> > Yes, I think we ought to probably do the same thing globally. It's nice to know that xfs has already been doing this. That makes me feel better about making this behavior the gold standard for Linux filesystems.
> >
> > So to summarize, at this point in the discussion, I think we want to consider doing the following:
> >
> > * better reporting from syncfs (report an error when even one inode failed to be written back since last syncfs call). We'll probably implement this via a per-sb errseq_t in some fashion, though there are some implementation issues to work out.
> > * invalidate or clear uptodate flag on pages that experience writeback errors, across filesystems. Encourage this as standard behavior for filesystems and maybe add helpers to make it easier to do this.
> >
> > Did I miss anything? Would that be enough to help the Pg usecase?
> >
> > I don't see us ever being able to reasonably support its current expectation that writeback errors will be seen on fd's that were opened after the error occurred. That's a really thorny problem from an object lifetime perspective.
>
> I think we can do better than XFS is currently doing (but I agree that we should have the same behaviour across all Linux filesystems!)
>
> If we get an error while wbc->for_background is true, we should not clear uptodate on the page, rather SetPageError and SetPageDirty.

So you're saying we should treat it as a transient error rather than a permanent error.

> Background writebacks should skip pages which are PageError.

That seems decidedly dodgy in the case where there is a transient error - it requires a user to specifically run sync to get the data to disk after the transient error has occurred. Say they don't notice the problem because it's fleeting and doesn't cause any obvious problems?

e.g.
XFS gets to enospc, runs out of reserve pool blocks so can't allocate space to write back the page, then space is freed up a few seconds later and so the next write will work just fine.

This is a recipe for "I lost data that I wrote /days/ before the system crashed" bug reports.

> for_sync writebacks should attempt one last write. Maybe it'll succeed this time. If it does, just ClearPageError. If not, we have somebody to report this writeback error to, and ClearPageUptodate.

Which may well be unmount. Are we really going to wait until unmount to report fatal errors?

We used to do this with XFS metadata. We'd just keep trying to write metadata and keep the filesystem running (because it's consistent in memory and it might be a transient error) rather than shutting down the filesystem after a couple of retries. The result was that users wouldn't notice there were problems until unmount, and the most common symptom of that was "why is system shutdown hanging?".

We now don't hang at unmount by default:

  $ cat /sys/fs/xfs/dm-0/error/fail_at_unmount
  1
  $

And we treat different errors according to their seriousness. EIO and device ENOSPC we default to retry forever because they are often transient, but for ENODEV we fail and shutdown immediately (someone pulled the USB stick out). Metadata failure behaviour is configured via changing fields in /sys/fs/xfs//error/metadata//...

We've planned to extend this failure configuration to data IO, too, but never quite got around to it yet. This is a clear example of "one size doesn't fit all" and I think we'll end up doing the same sort of error behaviour configuration in XFS for these cases. (i.e.
/sys/fs/xfs//error/writeback//....)

> And this logic all needs to be in one place, although invoked from each filesystem.

Perhaps so, but as there's no "one-size-fits-all" behaviour, I really want to extend the XFS error config infrastructure to control what the filesystem does on error here.

From: Andres Freund <andres@...razel.de>
Date: Fri, 13 Apr 2018 19:04:33 -0700

Hi,

On 2018-04-14 11:47:52 +1000, Dave Chinner wrote:
> And we treat different errors according to their seriousness. EIO and device ENOSPC we default to retry forever because they are often transient, but for ENODEV we fail and shutdown immediately (someone pulled the USB stick out). Metadata failure behaviour is configured via changing fields in /sys/fs/xfs//error/metadata//...
>
> We've planned to extend this failure configuration to data IO, too, but never quite got around to it yet. This is a clear example of "one size doesn't fit all" and I think we'll end up doing the same sort of error behaviour configuration in XFS for these cases. (i.e. /sys/fs/xfs//error/writeback//....)

Have you considered adding an ext/fat/jfs errors=remount-ro/panic/continue style mount parameter?

From: Matthew Wilcox <willy@...radead.org>
Date: Fri, 13 Apr 2018 19:38:14 -0700

On Sat, Apr 14, 2018 at 11:47:52AM +1000, Dave Chinner wrote:
> On Fri, Apr 13, 2018 at 07:02:32AM -0700, Matthew Wilcox wrote:
> > If we get an error while wbc->for_background is true, we should not clear uptodate on the page, rather SetPageError and SetPageDirty.
>
> So you're saying we should treat it as a transient error rather than a permanent error.

Yes, I'm proposing leaving the data in memory in case the user wants to try writing it somewhere else.

> > Background writebacks should skip pages which are PageError.
>
> That seems decidedly dodgy in the case where there is a transient error - it requires a user to specifically run sync to get the data to disk after the transient error has occurred. Say they don't notice the problem because it's fleeting and doesn't cause any obvious problems?

That's fair.
What I want to avoid is triggering the same error every 30 seconds (or whatever the periodic writeback threshold is set to).

> e.g. XFS gets to enospc, runs out of reserve pool blocks so can't allocate space to write back the page, then space is freed up a few seconds later and so the next write will work just fine.
>
> This is a recipe for "I lost data that I wrote /days/ before the system crashed" bug reports.

So ... exponential backoff on retries?

> > for_sync writebacks should attempt one last write. Maybe it'll succeed this time. If it does, just ClearPageError. If not, we have somebody to report this writeback error to, and ClearPageUptodate.
>
> Which may well be unmount. Are we really going to wait until unmount to report fatal errors?

Goodness, no. The errors would be immediately reportable using the wb_err mechanism, as soon as the first error was encountered.

From: bfields@...ldses.org (J. Bruce Fields)
Date: Wed, 18 Apr 2018 12:52:19 -0400

Theodore Y. Ts'o - 10.04.18, 20:43:
> First of all, what storage devices will do when they hit an exception condition is quite non-deterministic. For example, the vast majority of SSD's are not power fail certified. What this means is that if they suffer a power drop while they are doing a GC, it is quite possible for data written six months ago to be lost as a result. The LBA could potentially be far, far away from any LBA's that were recently written, and there could have been multiple CACHE FLUSH operations in the time since the LBA in question was last written six months ago. No matter; for a consumer-grade SSD, it's possible for that LBA to be trashed after an unexpected power drop.

Pointers to documentation or papers or anything? The only google results I can find for "power fail certified" are your posts.

I've always been confused by SSD power-loss protection, as nobody seems completely clear whether it's a safety or a performance feature.

From: bfields@...ldses.org (J. Bruce Fields)
Date: Wed, 18 Apr 2018 14:09:03 -0400

On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:
> Hi,
>
> On 2018-04-11 15:52:44 -0600, Andreas Dilger wrote:
> > On Apr 10, 2018, at 4:07 PM, Andres Freund <andres@...razel.de> wrote:
> > > 2018-04-10 18:43:56 Ted wrote:
> > > > So for better or for worse, there has not been as much investment in buffered I/O and data robustness in the face of exception handling of storage devices.
> > >
> > > That's a bit of a cop out. It's not just databases that care. Even more basic tools like SCM, package managers and editors care whether they can get proper responses back from fsync that imply things actually were synced.
> >
> > Sure, but it is mostly PG that is doing (IMHO) crazy things like writing to thousands(?) of files, closing the file descriptors, then expecting fsync() on a newly-opened fd to return a historical error.
>
> It's not just postgres. dpkg (underlying apt, on debian derived distros) to take an example I just randomly guessed, does too:
>
>   /* We want to guarantee the extracted files are on the disk, so that the
>    * subsequent renames to the info database do not end up with old or zero
>    * length files in case of a system crash. As neither dpkg-deb nor tar do
>    * explicit fsync()s, we have to do them here.
>    * XXX: This could be avoided by switching to an internal tar extractor. */
>   dir_sync_contents(cidir);
>
> (a bunch of other places too)
>
> Especially on ext3 but also on newer filesystems it's performancewise entirely infeasible to fsync() every single file individually - the performance becomes entirely atrocious if you do that.

Is that still true if you're able to use some kind of parallelism? (async io, or fsync from multiple processes?)

From: Dave Chinner <david@...morbit.com>
Date: Thu, 19 Apr 2018 09:59:50 +1000

On Fri, Apr 13, 2018 at 07:04:33PM -0700, Andres Freund wrote:
> Hi,
>
> On 2018-04-14 11:47:52 +1000, Dave Chinner wrote:
> > And we treat different errors according to their seriousness.
> > EIO and device ENOSPC we default to retry forever because they are often transient, but for ENODEV we fail and shutdown immediately (someone pulled the USB stick out). Metadata failure behaviour is configured via changing fields in /sys/fs/xfs//error/metadata//...
> >
> > We've planned to extend this failure configuration to data IO, too, but never quite got around to it yet. This is a clear example of "one size doesn't fit all" and I think we'll end up doing the same sort of error behaviour configuration in XFS for these cases. (i.e. /sys/fs/xfs//error/writeback//....)
>
> Have you considered adding an ext/fat/jfs errors=remount-ro/panic/continue style mount parameter?

That's for metadata writeback error behaviour, not data writeback IO errors.

We are definitely not planning to add mount options to configure IO error behaviors. Mount options are a horrible way to configure filesystem behaviour and we've already got other, fine-grained configuration infrastructure for configuring IO error behaviour. Which, as I just pointed out, was designed to be extended to data writeback and other operational error handling in the filesystem (e.g. dealing with ENOMEM in different ways).

From: Dave Chinner <david@...morbit.com>
Date: Thu, 19 Apr 2018 10:13:43 +1000

On Fri, Apr 13, 2018 at 07:38:14PM -0700, Matthew Wilcox wrote:
> On Sat, Apr 14, 2018 at 11:47:52AM +1000, Dave Chinner wrote:
> > On Fri, Apr 13, 2018 at 07:02:32AM -0700, Matthew Wilcox wrote:
> > > If we get an error while wbc->for_background is true, we should not clear uptodate on the page, rather SetPageError and SetPageDirty.
> >
> > So you're saying we should treat it as a transient error rather than a permanent error.
>
> Yes, I'm proposing leaving the data in memory in case the user wants to try writing it somewhere else.

And if it's getting IO errors because of USB stick pull?
What then?

> > > Background writebacks should skip pages which are PageError.
> >
> > That seems decidedly dodgy in the case where there is a transient error - it requires a user to specifically run sync to get the data to disk after the transient error has occurred. Say they don't notice the problem because it's fleeting and doesn't cause any obvious problems?
>
> That's fair. What I want to avoid is triggering the same error every 30 seconds (or whatever the periodic writeback threshold is set to).

So if kernel ring buffer overflows and so users miss the first error report, they'll have no idea that the data writeback is still failing?

> > e.g. XFS gets to enospc, runs out of reserve pool blocks so can't allocate space to write back the page, then space is freed up a few seconds later and so the next write will work just fine.
> >
> > This is a recipe for "I lost data that I wrote /days/ before the system crashed" bug reports.
>
> So ... exponential backoff on retries?

Maybe, but I don't think that actually helps anything and adds yet more "when should we write this" complication to inode writeback....

> > > for_sync writebacks should attempt one last write. Maybe it'll succeed this time. If it does, just ClearPageError. If not, we have somebody to report this writeback error to, and ClearPageUptodate.
> >
> > Which may well be unmount. Are we really going to wait until unmount to report fatal errors?
>
> Goodness, no. The errors would be immediately reportable using the wb_err mechanism, as soon as the first error was encountered.

But if there are no open files when the error occurs, that error won't get reported to anyone. Which means the next time anyone accesses that inode from a user context could very well be unmount or a third party sync/syncfs()....

From: Eric Sandeen <esandeen@...hat.com>
Date: Wed, 18 Apr 2018 19:23:46 -0500

On 4/18/18 6:59 PM, Dave Chinner wrote:
> On Fri, Apr 13, 2018 at 07:04:33PM -0700, Andres Freund wrote:
> > Hi,
> >
> > On 2018-04-14 11:47:52 +1000, Dave Chinner wrote:
> > > And we treat different errors according to their seriousness.
> > > EIO and device ENOSPC we default to retry forever because they are often transient, but for ENODEV we fail and shutdown immediately (someone pulled the USB stick out). Metadata failure behaviour is configured via changing fields in /sys/fs/xfs//error/metadata//...
> > >
> > > We've planned to extend this failure configuration to data IO, too, but never quite got around to it yet. This is a clear example of "one size doesn't fit all" and I think we'll end up doing the same sort of error behaviour configuration in XFS for these cases. (i.e. /sys/fs/xfs//error/writeback//....)
> >
> > Have you considered adding an ext/fat/jfs errors=remount-ro/panic/continue style mount parameter?
>
> That's for metadata writeback error behaviour, not data writeback IO errors.

/me points casually at data_err=abort & data_err=ignore in ext4...

  data_err=ignore    Just print an error message if an error occurs in a file data buffer in ordered mode.
  data_err=abort     Abort the journal if an error occurs in a file data buffer in ordered mode.

Just sayin'

> We are definitely not planning to add mount options to configure IO error behaviors. Mount options are a horrible way to configure filesystem behaviour and we've already got other, fine-grained configuration infrastructure for configuring IO error behaviour. Which, as I just pointed out, was designed to be extended to data writeback and other operational error handling in the filesystem (e.g.
> dealing with ENOMEM in different ways).

I don't disagree, but there are already mount-option knobs in ext4, FWIW.

From: Matthew Wilcox <willy@...radead.org>
Date: Wed, 18 Apr 2018 17:40:37 -0700

On Thu, Apr 19, 2018 at 10:13:43AM +1000, Dave Chinner wrote:
> On Fri, Apr 13, 2018 at 07:38:14PM -0700, Matthew Wilcox wrote:
> > On Sat, Apr 14, 2018 at 11:47:52AM +1000, Dave Chinner wrote:
> > > On Fri, Apr 13, 2018 at 07:02:32AM -0700, Matthew Wilcox wrote:
> > > > If we get an error while wbc->for_background is true, we should not clear uptodate on the page, rather SetPageError and SetPageDirty.
> > >
> > > So you're saying we should treat it as a transient error rather than a permanent error.
> >
> > Yes, I'm proposing leaving the data in memory in case the user wants to try writing it somewhere else.
>
> And if it's getting IO errors because of USB stick pull? What then?

I've been thinking about this. Ideally we want to pass some kind of notification all the way up to the desktop and tell the user to plug the damn stick back in. Then have the USB stick become the same blockdev that it used to be, and complete the writeback. We are so far from being able to do that right now that it's not even funny.

> > > > Background writebacks should skip pages which are PageError.
> > >
> > > That seems decidedly dodgy in the case where there is a transient error - it requires a user to specifically run sync to get the data to disk after the transient error has occurred. Say they don't notice the problem because it's fleeting and doesn't cause any obvious problems?
> >
> > That's fair.
What I want to avoid is triggering the same error every30 seconds (or whatever the periodic writeback threshold is set to).So if kernel ring buffer overflows and so users miss the first errorreport, they'll have no idea that the data writeback is stillfailing?I wasn't thinking about kernel ringbuffer based reporting; I was thinkingabout errseq_t based reporting, so the application can tell the fsyncfailed and maybe does something application-level to recover like sendthe transactions across to another node in the cluster (or whatever thishypothetical application is).for_sync writebacks should attempt one last write. Maybe it'llsucceed this time. If it does, just ClearPageError. If not, we havesomebody to report this writeback error to, and ClearPageUptodate.Which may well be unmount. Are we really going to wait until unmountto report fatal errors?Goodness, no. The errors would be immediately reportable using the wb_errmechanism, as soon as the first error was encountered.But if there are no open files when the error occurs, that errorwon't get reported to anyone. Which means the next time anyoneaccesses that inode from a user context could very well be unmountor a third party sync/syncfs()....Right. But then that's on the application.From: "Theodore Y. Ts'o" <tytso@....edu>Date: Wed, 18 Apr 2018 21:08:19 -0400On Wed, Apr 18, 2018 at 05:40:37PM -0700, Matthew Wilcox wrote:I've been thinking about this. Ideally we want to pass some kind ofnotification all the way up to the desktop and tell the user to plug thedamn stick back in. Then have the USB stick become the same blockdevthat it used to be, and complete the writeback. We are so far frombeing able to do that right now that it's not even funny.oMaybe we shouldn't be trying to do any of this in the kernel, or atleast as little as possible in the kernel? 
Perhaps it would be better to do most of this as a device mapper hack; I suspect we'll need userspace help to figure out whether the user has plugged the same USB stick in, or a different USB stick, anyway.

From: Christoph Hellwig <hch@...radead.org>
Date: Thu, 19 Apr 2018 01:39:04 -0700

On Wed, Apr 18, 2018 at 12:52:19PM -0400, J. Bruce Fields wrote:

Theodore Y. Ts'o - 10.04.18, 20:43:

First of all, what storage devices will do when they hit an exception condition is quite non-deterministic. For example, the vast majority of SSD's are not power fail certified. What this means is that if they suffer a power drop while they are doing a GC, it is quite possible for data written six months ago to be lost as a result. The LBA could potentially be far, far away from any LBA's that were recently written, and there could have been multiple CACHE FLUSH operations in the time since the LBA in question was last written six months ago. No matter; for a consumer-grade SSD, it's possible for that LBA to be trashed after an unexpected power drop.

Pointers to documentation or papers or anything? The only google results I can find for "power fail certified" are your posts.

I've always been confused by SSD power-loss protection, as nobody seems completely clear whether it's a safety or a performance feature.

Devices from reputable vendors should always be power fail safe, bugs notwithstanding. What power-loss protection in marketing slides usually means is that an SSD has a non-volatile write cache. That is, once a write is ACKed, data is persisted and no additional cache flush needs to be sent. This is a feature only available in expensive enterprise SSDs as the required capacitors are expensive. Cheaper consumer or boot driver SSDs have a volatile write cache, that is we need to do a separate cache flush to persist data (REQ_OP_FLUSH in Linux). But a reasonable implementation of those still won't corrupt previously written data, they will just lose the volatile write cache that hasn't been flushed. Occasional bugs, bad actors or other issues might still
Occasional bugs, bad actors or other issues might stillhappen.From: "J. Bruce Fields" <bfields@...ldses.org>Date: Thu, 19 Apr 2018 10:10:16 -0400On Thu, Apr 19, 2018 at 01:39:04AM -0700, Christoph Hellwig wrote:On Wed, Apr 18, 2018 at 12:52:19PM -0400, J. Bruce Fields wrote:Theodore Y. Ts'o - 10.04.18, 20:43:First of all, what storage devices will do when they hit an exceptioncondition is quite non-deterministic. For example, the vast majorityof SSD's are not power fail certified. What this means is that ifthey suffer a power drop while they are doing a GC, it is quitepossible for data written six months ago to be lost as a result. TheLBA could potentialy be far, far away from any LBA's that wererecently written, and there could have been multiple CACHE FLUSHoperations in the since the LBA in question was last written sixmonths ago. No matter; for a consumer-grade SSD, it's possible forthat LBA to be trashed after an unexpected power drop.Pointers to documentation or papers or anything? The only googleresults I can find for "power fail certified" are your posts.I've always been confused by SSD power-loss protection, as nobody seemscompletely clear whether it's a safety or a performance feature.Devices from reputable vendors should always be power fail safe, bugsnotwithstanding. What power-loss protection in marketing slides usuallymeans is that an SSD has a non-volatile write cache. That is once awrite is ACKed data is persisted and no additional cache flush needs tobe sent. This is a feature only available in expensive eterprise SSDsas the required capacitors are expensive. Cheaper consumer or bootdriver SSDs have a volatile write cache, that is we need to do aseparate cache flush to persist data (REQ_OP_FLUSH in Linux). Buta reasonable implementation of those still won't corrupt previouslywritten data, they will just lose the volatile write cache that hasn'tbeen flushed. Occasional bugs, bad actors or other issues might stillhappen.Thanks! 
That was my understanding too. But then the name is terrible. As is all the vendor documentation I can find:

https://insights.samsung.com/2016/03/22/power-loss-protection-how-ssds-are-protecting-data-integrity-white-paper/
"Power loss protection is a critical aspect of ensuring data integrity, especially in servers or data centers."

https://www.intel.com/content/.../ssd-320-series-power-loss-data-protection-brief.pdf
"Data safety features prepare for unexpected power-loss and protect system and user data."

Why do they all neglect to mention that their consumer drives are also perfectly capable of well-defined behavior after power loss, just at the expense of flush performance? It's ridiculously confusing.

From: Matthew Wilcox <willy@...radead.org>
Date: Thu, 19 Apr 2018 10:40:10 -0700

On Wed, Apr 18, 2018 at 09:08:19PM -0400, Theodore Y. Ts'o wrote:

On Wed, Apr 18, 2018 at 05:40:37PM -0700, Matthew Wilcox wrote:

I've been thinking about this. Ideally we want to pass some kind of notification all the way up to the desktop and tell the user to plug the damn stick back in. Then have the USB stick become the same blockdev that it used to be, and complete the writeback. We are so far from being able to do that right now that it's not even funny.

Maybe we shouldn't be trying to do any of this in the kernel, or at least as little as possible in the kernel? Perhaps it would be better to do most of this as a device mapper hack; I suspect we'll need userspace help to figure out whether the user has plugged the same USB stick in, or a different USB stick, anyway.

The device mapper target (dm-removable?) was my first idea too, but I kept thinking through use cases and I think we end up wanting this functionality in the block layer. Let's try a story.

Stephen the PFY goes into the data centre looking to hotswap a failed drive. Due to the eight pints of lager he had for lunch, he pulls out the root drive instead of the failed drive.
The air raid siren warbles and he realises his mistake, shoving the drive back in.

CYOA:

Currently: All writes are lost, calamities ensue. The PFY is fired.

With dm-removable: Nobody thought to set up dm-removable on the root drive. Calamities still ensue, but now it's the BOFH's fault instead of the PFY's fault.

Built into the block layer: After a brief hiccup while we reattach the drive to its block_device, the writes resume and nobody loses their job.

From: "Theodore Y. Ts'o" <tytso@....edu>
Date: Thu, 19 Apr 2018 19:27:15 -0400

On Thu, Apr 19, 2018 at 10:40:10AM -0700, Matthew Wilcox wrote:

With dm-removable: Nobody thought to set up dm-removable on the root drive. Calamities still ensue, but now it's the BOFH's fault instead of the PFY's fault.

Built into the block layer: After a brief hiccup while we reattach the drive to its block_device, the writes resume and nobody loses their job.

What you're talking about is a deployment issue, though. Ultimately the distribution will set up dm-removable automatically if the user requests it, much like it sets up dm-crypt automatically for laptop users upon request.

My concern is that not all removable devices have a globally unique id number available in hardware so the kernel can tell whether or not it's the same device that has been plugged in. There are heuristics you could use -- for example, you could look at the file system uuid plus the last fsck time.
But they tend to be very file systemspecific, and not things we would want ot have in the kernel.From: Dave Chinner <david@...morbit.com>Date: Fri, 20 Apr 2018 09:28:59 +1000On Wed, Apr 18, 2018 at 05:40:37PM -0700, Matthew Wilcox wrote:On Thu, Apr 19, 2018 at 10:13:43AM +1000, Dave Chinner wrote:On Fri, Apr 13, 2018 at 07:38:14PM -0700, Matthew Wilcox wrote:On Sat, Apr 14, 2018 at 11:47:52AM +1000, Dave Chinner wrote:On Fri, Apr 13, 2018 at 07:02:32AM -0700, Matthew Wilcox wrote:If we get an error while wbc->for_background is true, we should not clearuptodate on the page, rather SetPageError and SetPageDirty.So you're saying we should treat it as a transient error rather thana permanent error.Yes, I'm proposing leaving the data in memory in case the user wants totry writing it somewhere else.And if it's getting IO errors because of USB stick pull? Whatthen?I've been thinking about this. Ideally we want to pass some kind ofnotification all the way up to the desktop and tell the user to plug thedamn stick back in. Then have the USB stick become the same blockdevthat it used to be, and complete the writeback. We are so far frombeing able to do that right now that it's not even funny.nodBut in the meantime, device unplug (should give ENODEV, not EIO) isa fatal error and we need to toss away the data.Background writebacks should skip pages which are PageError.That seems decidedly dodgy in the case where there is a transienterror - it requires a user to specifically run sync to get the datato disk after the transient error has occurred. Say they don'tnotice the problem because it's fleeting and doesn't cause anyobvious problems?That's fair. 
What I want to avoid is triggering the same error every30 seconds (or whatever the periodic writeback threshold is set to).So if kernel ring buffer overflows and so users miss the first errorreport, they'll have no idea that the data writeback is stillfailing?I wasn't thinking about kernel ringbuffer based reporting; I was thinkingabout errseq_t based reporting, so the application can tell the fsyncfailed and maybe does something application-level to recover like sendthe transactions across to another node in the cluster (or whatever thishypothetical application is).But if it's still failing, then we should be still trying to reportthe error. i.e. if fsync fails and the page remains dirty, then thenext attmept to write it is a new error and fsync should reportthat. IOWs, I think we should be returning errors at every occasionerrors need to be reported if we have a persistent writebackfailure...for_sync writebacks should attempt one last write. Maybe it'llsucceed this time. If it does, just ClearPageError. If not, we havesomebody to report this writeback error to, and ClearPageUptodate.Which may well be unmount. Are we really going to wait until unmountto report fatal errors?Goodness, no. The errors would be immediately reportable using the wb_errmechanism, as soon as the first error was encountered.But if there are no open files when the error occurs, that errorwon't get reported to anyone. Which means the next time anyoneaccesses that inode from a user context could very well be unmountor a third party sync/syncfs()....Right. But then that's on the application.Which we know don't do the right thing. Seems like a lot of hoops tojump through given it still won't work if the appliction isn'tchanged to support linux specific error handling requirements...From: Jan Kara <jack@...e.cz>Date: Sat, 21 Apr 2018 18:59:54 +0200On Fri 13-04-18 07:48:07, Matthew Wilcox wrote:On Tue, Apr 10, 2018 at 03:07:26PM -0700, Andres Freund wrote:I don't think that's the full issue. 
We can deal with the fact that an fsync failure is edge-triggered if there's a guarantee that every process doing so would get it. The fact that one needs to have an FD open from before any failing writes occurred to get a failure, THAT'S the big issue.

Beyond postgres, it's a pretty common approach to do work on a lot of files without fsyncing, then iterate over the directory, fsync everything, and then assume you're safe. But unless I severely misunderstand something that'd only be safe if you kept an FD for every file open, which isn't realistic for pretty obvious reasons.

While accepting that under memory pressure we can still evict the error indicators, we can do a better job than we do today. The current design of error reporting says that all errors which occurred before you opened the file descriptor are of no interest to you. I don't think that's necessarily true, and it's actually a change of behaviour from before the errseq work.

Consider Stupid Task A which calls open(), write(), close(), and Smart Task B which calls open(), write(), fsync(), close() operating on the same file. If A goes entirely before B and encounters an error, before errseq_t, B would see the error from A's write.

If A and B overlap, even a little bit, then B still gets to see A's error today. But if writeback happens for A's write before B opens the file then B will never see the error.

B doesn't want to see historical errors that a previous invocation of B has already handled, but we know whether anyone has seen the error or not. So here's a patch which restores the historical behaviour of seeing old unhandled errors on a fresh file descriptor:

Signed-off-by: Matthew Wilcox <mawilcox@...rosoft.com>

So I agree with going to the old semantics of reporting errors from before a file was open at least once to someone. As the PG case shows apps are indeed relying on the old behavior.
As much as it is unreliable, it ends up doing the right thing for these apps in 99% of cases and we shouldn't break them (BTW IMO the changelog should contain a note that this fixes a regression of PostgreSQL, a reference to this thread and CC to stable).

Anyway feel free to add:

Reviewed-by: Jan Kara <jack@...e.cz>

Oh, and to make myself clear I do think we need to find a better way of reporting IO errors. I consider this just an immediate band-aid to avoid userspace regressions.

diff --git a/lib/errseq.c b/lib/errseq.c
index df782418b333..093f1fba4ee0 100644
--- a/lib/errseq.c
+++ b/lib/errseq.c
@@ -119,19 +119,11 @@ EXPORT_SYMBOL(errseq_set);
 errseq_t errseq_sample(errseq_t *eseq)
 {
 	errseq_t old = READ_ONCE(*eseq);
-	errseq_t new = old;

-	/*
-	 * For the common case of no errors ever having been set, we can skip
-	 * marking the SEEN bit. Once an error has been set, the value will
-	 * never go back to zero.
-	 */
-	if (old != 0) {
-		new |= ERRSEQ_SEEN;
-		if (old != new)
-			cmpxchg(eseq, old, new);
-	}
-	return new;
+	/* If nobody has seen this error yet, then we can be the first. */
+	if (!(old & ERRSEQ_SEEN))
+		old = 0;
+	return old;

From: Jan Kara <jack@...e.cz>
Date: Sat, 21 Apr 2018 20:14:29 +0200

On Thu 12-04-18 07:09:14, Jeff Layton wrote:

On Wed, 2018-04-11 at 20:02 -0700, Matthew Wilcox wrote:

At the moment, when we open a file, we sample the current state of the writeback error and only report new errors. We could set it to zero instead, and report the most recent error as soon as anything happens which would report an error. That way err = close(open("file")); would report the most recent error.

That's not going to be persistent across the data structure for that inode being removed from memory; we'd need filesystem support for persisting that. But maybe it's "good enough" to only support it for recent files.

Jeff, what do you think?

I hate it :).
We could do that, but....yecchhhh.

Reporting errors only in the case where the inode happened to stick around in the cache seems too unreliable for real-world usage, and might be problematic for some use cases. I'm also not sure it would really be helpful.

So this is never going to be perfect but I think we could do good enough by:

1) Mark inodes that hit IO error.
2) If the inode gets evicted from memory we store the fact that we hit an error for this IO in a more space efficient data structure (sparse bitmap, radix tree, extent tree, whatever).
3) If the underlying device gets destroyed, we can just switch the whole SB to an error state and forget per inode info.
4) If there's too much of per-inode error info (probably per-fs configurable limit in terms of number of inodes), we would yell in the kernel log, switch the whole fs to the error state and forget per inode info.

This way there won't be silent loss of IO errors. Memory usage would be reasonably limited. It could happen the whole fs would switch to error state "prematurely" but if that's a problem for the machine, admin could tune the limit for number of inodes to keep IO errors for...

I think the crux of the matter here is not really about error reporting, per-se.

I think this is related but a different question.

I asked this at LSF last year, and got no real answer:

When there is a writeback error, what should be done with the dirty page(s)? Right now, we usually just mark them clean and carry on. Is that the right thing to do?

One possibility would be to invalidate the range that failed to be written (or the whole file) and force the pages to be faulted in again on the next access. It could be surprising for some applications to not see the results of their writes on a subsequent read after such an event.

Maybe that's ok in the face of a writeback error though? IDK.

I can see the admin wanting to rather kill the machine with OOM than having to deal with data loss due to IO errors (e.g. if he has HA server fail over set up).
Or retry for some time before dropping the dirty data. Or do what we do now (possibly with invalidating pages as you say). As Dave said elsewhere there's not one strategy that's going to please everybody. So it might be beneficial to have this configurable like XFS has it for metadata.

OTOH if I look at the problem from application developer POV, most apps will just declare game over at the face of IO errors (if they take care to check for them at all). And the sophisticated apps that will try some kind of error recovery have to be prepared that the data is just gone (as depending on what exactly the kernel does is rather fragile) so I'm not sure how much practical value the configurable behavior on writeback errors would bring.
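The application-side stance described in the thread (keep the file descriptor open across the writes, check the result of fsync, and declare game over on failure rather than retrying) can be sketched in a few lines. This is a minimal illustration in Python under stated assumptions, not how PostgreSQL or any other program mentioned in the thread implements it; the helper name durable_write and the file name example.dat are hypothetical, and the directory fsync assumes a POSIX-style system.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and fsync it; any I/O error propagates to the caller.

    Hypothetical helper for illustration only. A failed fsync() is not
    retried: as noted in the thread, after a writeback error the kernel
    usually marks the dirty pages clean, so a later fsync() could succeed
    without the data ever having reached stable storage.
    """
    # The same descriptor is used for the writes and the fsync, so it was
    # open before any write that could fail during writeback.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may be a short write
            view = view[written:]
        os.fsync(fd)  # raises OSError (for example EIO or ENOSPC) on failure
    finally:
        os.close(fd)

    # Also fsync the containing directory so the new entry itself is durable.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.dat")
        try:
            durable_write(target, b"transaction record\n")
        except OSError as exc:
            # "Game over": report and stop instead of retrying the fsync.
            raise SystemExit(f"giving up, data may not be durable: {exc}")
        print("written and synced:", target)
```

Whether stopping is the right reaction depends on the application; a program with a replica or a write-ahead log might instead replay the data elsewhere, which is the kind of application-level recovery mentioned earlier in the thread.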
I've had this nagging feeling that the computers I use today feel slower than the computers I used as a kid. As a rule, I don't trust this kind of feeling because human perception has been shown to be unreliable in empirical studies, so I carried around a high-speed camera and measured the response latency of devices I've run into in the past few months. Here are the results:

computer                     latency (ms)  year  clock     transistors
apple 2e                     30            1983  1 MHz     3.5k
ti 99/4a                     40            1981  3 MHz     8k
custom haswell-e 165Hz       50            2014  3.5 GHz   2G
commodore pet 4016           60            1977  1 MHz     3.5k
sgi indy                     60            1993  .1 GHz    1.2M
custom haswell-e 120Hz       60            2014  3.5 GHz   2G
thinkpad 13 chromeos         70            2017  2.3 GHz   1G
imac g4 os 9                 70            2002  .8 GHz    11M
custom haswell-e 60Hz        80            2014  3.5 GHz   2G
mac color classic            90            1993  16 MHz    273k
powerspec g405 linux 60Hz    90            2017  4.2 GHz   2G
macbook pro 2014             100           2014  2.6 GHz   700M
thinkpad 13 linux chroot     100           2017  2.3 GHz   1G
lenovo x1 carbon 4g linux    110           2016  2.6 GHz   1G
imac g4 os x                 120           2002  .8 GHz    11M
custom haswell-e 24Hz        140           2014  3.5 GHz   2G
lenovo x1 carbon 4g win      150           2016  2.6 GHz   1G
next cube                    150           1988  25 MHz    1.2M
powerspec g405 linux         170           2017  4.2 GHz   2G
packet around the world      190
powerspec g405 win           200           2017  4.2 GHz   2G
symbolics 3620               300           1986  5 MHz     390k

These are tests of the latency between a keypress and the display of a character in a terminal (see appendix for more details). The results are sorted from quickest to slowest. In the latency column, the background goes from green to yellow to red to black as devices get slower and the background gets darker as devices get slower. No devices are green. When multiple OSes were tested on the same machine, the OS is in bold. When multiple refresh rates were tested on the same machine, the refresh rate is in italics.

In the year column, the background gets darker and purple-er as devices get older.
If older devices were slower, we'd see the year column get darker as we read down the chart.

The next two columns show the clock speed and number of transistors in the processor. Smaller numbers are darker and blue-er. As above, if slower clocked and smaller chips correlated with longer latency, the columns would get darker as we go down the table, but it, if anything, seems to be the other way around.

For reference, the latency of a packet going around the world through fiber from NYC back to NYC via Tokyo and London is inserted in the table.

If we look at overall results, the fastest machines are ancient. Newer machines are all over the place. Fancy gaming rigs with unusually high refresh-rate displays are almost competitive with machines from the late 70s and early 80s, but normal modern computers can't compete with thirty to forty year old machines.

We can also look at mobile devices. In this case, we'll look at scroll latency in the browser:

device                    latency (ms)  year
ipad pro 10.5" pencil     30            2017
ipad pro 10.5"            70            2017
iphone 4s                 70            2011
iphone 6s                 70            2015
iphone 3gs                70            2009
iphone x                  80            2017
iphone 8                  80            2017
iphone 7                  80            2016
iphone 6                  80            2014
gameboy color             80            1998
iphone 5                  90            2012
blackberry q10            100           2013
huawei honor 8            110           2016
google pixel 2 xl         110           2017
galaxy s7                 120           2016
galaxy note 3             120           2016
moto x                    120           2013
nexus 5x                  120           2015
oneplus 3t                130           2016
blackberry key one        130           2017
moto e (2g)               140           2015
moto g4 play              140           2017
moto g4 plus              140           2016
google pixel              140           2016
samsung galaxy avant      150           2014
asus zenfone3 max         150           2016
sony xperia z5 compact    150           2015
htc one m4                160           2013
galaxy s4 mini            170           2013
lg k4                     180           2016
packet                    190
htc rezound               240           2011
palm pilot 1000           490           1996
kindle oasis 2            570           2017
kindle paperwhite 3       630           2015
kindle 4                  860           2011

As above, the results are sorted by latency and color-coded from green to yellow to red to black as devices get slower. Also as above, the year gets purple-er (and darker) as the device gets older.

If we exclude the game boy color, which is a different class of device than the rest, all of the quickest devices are Apple phones or tablets.
The next quickest device is the blackberry q10. Although we don't have enough data to really tell why the blackberry q10 is unusually quick for a non-Apple device, one plausible guess is that it's helped by having actual buttons, which are easier to implement with low latency than a touchscreen. The other two devices with actual buttons are the gameboy color and the kindle 4.

After the iphones and non-kindle button devices, we have a variety of Android devices of various ages. At the bottom, we have the ancient palm pilot 1000 followed by the kindles. The palm is hamstrung by a touchscreen and display created in an era with much slower touchscreen technology and the kindles use e-ink displays, which are much slower than the displays used on modern phones, so it's not surprising to see those devices at the bottom.

Why is the apple 2e so fast?

Compared to a modern computer that's not the latest ipad pro, the apple 2 has significant advantages on both the input and the output, and it also has an advantage between the input and the output for all but the most carefully written code since the apple 2 doesn't have to deal with context switches, buffers involved in handoffs between different processes, etc.

On the input, if we look at modern keyboards, it's common to see them scan their inputs at 100 Hz to 200 Hz (e.g., the ergodox claims to scan at 167 Hz). By comparison, the apple 2e effectively scans at 556 Hz. See appendix for details.

If we look at the other end of the pipeline, the display, we can also find latency bloat there. I have a display that advertises 1 ms switching on the box, but if we look at how long it takes for the display to actually show a character from when you can first see the trace of it on the screen until the character is solid, it can easily be 10 ms. You can even see this effect with some high-refresh-rate displays that are sold on their allegedly good latency.

At 144 Hz, each frame takes 7 ms.
A change to the screen will have 0 ms to 7 ms of extra latency as it waits for the next frame boundary before getting rendered (on average, we expect half of the maximum latency, or 3.5 ms). On top of that, even though my display at home advertises a 1 ms switching time, it actually appears to take 10 ms to fully change color once the display has started changing color. When we add up the latency from waiting for the next frame to the latency of an actual color change, we get an expected latency of 7/2 + 10 = 13.5 ms.

With the old CRT in the apple 2e, we'd expect half of a 60 Hz refresh (16.7 ms / 2) plus a negligible delay, or 8.3 ms. That's hard to beat today: a state of the art gaming monitor can get the total display latency down into the same range, but in terms of marketshare, very few people have such displays, and even displays that are advertised as being fast aren't always actually fast.

iOS rendering pipeline

If we look at what's happening between the input and the output, the differences between a modern system and an apple 2e are too many to describe without writing an entire book. To get a sense of the situation in modern machines, here's former iOS/UIKit engineer Andy Matuschak's high-level sketch of what happens on iOS, which he says should be presented with the disclaimer that "this is my out of date memory of out of date information":

- hardware has its own scanrate (e.g.
  120 Hz for recent touch panels), so that can introduce up to 8 ms latency
- events are delivered to the kernel through firmware; this is relatively quick but system scheduling concerns may introduce a couple ms here
- the kernel delivers those events to privileged subscribers (here, backboardd) over a mach port; more scheduling loss possible
- backboardd must determine which process should receive the event; this requires taking a lock against the window server, which shares that information (a trip back into the kernel, more scheduling delay)
- backboardd sends that event to the process in question; more scheduling delay possible before it is processed
- those events are only dequeued on the main thread; something else may be happening on the main thread (e.g. as result of a timer or network activity), so some more latency may result, depending on that work
- UIKit introduced 1-2 ms event processing overhead, CPU-bound
- application decides what to do with the event; apps are poorly written, so usually this takes many ms. the consequences are batched up in a data-driven update which is sent to the render server over IPC
  - If the app needs a new shared-memory video buffer as a consequence of the event, which will happen anytime something non-trivial is happening, that will require round-trip IPC to the render server; more scheduling delays
    - (trivial changes are things which the render server can incorporate itself, like affine transformation changes or color changes to layers; non-trivial changes include anything that has to do with text, most raster and vector operations)
  - These kinds of updates often end up being triple-buffered: the GPU might be using one buffer to render right now; the render server might have another buffer queued up for its next frame; and you want to draw into another.
    More (cross-process) locking here; more trips into kernel-land.
- the render server applies those updates to its render tree (a few ms)
- every N Hz, the render tree is flushed to the GPU, which is asked to fill a video buffer
  - Actually, though, there's often triple-buffering for the screen buffer, for the same reason I described above: the GPU's drawing into one now; another might be being read from in preparation for another frame
- every N Hz, that video buffer is swapped with another video buffer, and the display is driven directly from that memory
  - (this N Hz isn't necessarily ideally aligned with the preceding step's N Hz)

Andy says the actual amount of work happening here is typically quite small. A few ms of CPU time. Key overhead comes from:

- periodic scanrates (input device, render server, display) imperfectly aligned
- many handoffs across process boundaries, each an opportunity for something else to get scheduled instead of the consequences of the input event
- lots of locking, especially across process boundaries, necessitating trips into kernel-land

By comparison, on the Apple 2e, there basically aren't handoffs, locks, or process boundaries. Some very simple code runs and writes the result to the display memory, which causes the display to get updated on the next scan.

Refresh rate vs. latency

One thing that's curious about the computer results is the impact of refresh rate. We get a 90 ms improvement from going from 24 Hz to 165 Hz. At 24 Hz each frame takes 41.67 ms and at 165 Hz each frame takes 6.061 ms. As we saw above, if there weren't any buffering, we'd expect the average latency added by frame refreshes to be 20.8 ms in the former case and 3.03 ms in the latter case (because we'd expect to arrive at a uniform random point in the frame and have to wait between 0 ms and the full frame time), which is a difference of about 18 ms.
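As a rough check on the frame-time arithmetic here, the following Python sketch recomputes the numbers from the two refresh rates and the measured 90 ms gap (140 ms at 24 Hz versus 50 ms at 165 Hz in the table). The function names are made up for illustration, and the model is the same simplifying assumption the text makes: input arrives at a uniformly random point in the frame and nothing else depends on refresh rate.

```python
def frame_time_ms(refresh_hz: float) -> float:
    """Duration of one frame in milliseconds."""
    return 1000.0 / refresh_hz


def expected_wait_ms(refresh_hz: float) -> float:
    """Average wait for the next frame boundary: half a frame."""
    return frame_time_ms(refresh_hz) / 2.0


slow_hz, fast_hz = 24.0, 165.0
measured_gap_ms = 140.0 - 50.0  # latency at 24 Hz minus latency at 165 Hz

# Part of the gap explained purely by waiting for the next frame boundary.
wait_gap = expected_wait_ms(slow_hz) - expected_wait_ms(fast_hz)
# Difference in the length of a single frame between the two refresh rates.
frame_gap = frame_time_ms(slow_hz) - frame_time_ms(fast_hz)
# The remaining gap, expressed as a number of whole frames of buffering.
extra_frames = (measured_gap_ms - wait_gap) / frame_gap

print(f"frame time: {frame_time_ms(slow_hz):.2f} ms vs {frame_time_ms(fast_hz):.2f} ms")
print(f"gap explained by frame-boundary wait: {wait_gap:.1f} ms")
print(f"implied buffered frames: {extra_frames:.2f}")
```

The unexplained part of the gap comes out at roughly two frames, which is the kind of result you would expect from a pipeline that queues a couple of frames between the application and the display.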
But the difference is actually 90 ms, implying we have latency equivalent to (90 - 18) / (41.67 - 6.061) = 2 buffered frames.

If we plot the results from the other refresh rates on the same machine (not shown), we can see that they're roughly in line with a best fit curve that we get if we assume that, for that machine running powershell, we get 2.5 frames worth of latency regardless of refresh rate. This lets us estimate what the latency would be if we equipped this low latency gaming machine with an infinity Hz display -- we'd expect latency to be 140 - 2.5 * 41.67 = 36 ms, almost as fast as quick but standard machines from the 70s and 80s.

Complexity

Almost every computer and mobile device that people buy today is slower than common models of computers from the 70s and 80s. Low-latency gaming desktops and the ipad pro can get into the same range as quick machines from thirty to forty years ago, but most off-the-shelf devices aren't even close.

If we had to pick one root cause of latency bloat, we might say that it's because of "complexity". Of course, we all know that complexity is bad. If you've been to a non-academic non-enterprise tech conference in the past decade, there's a good chance that there was at least one talk on how complexity is the root of all evil and we should aspire to reduce complexity.

Unfortunately, it's a lot harder to remove complexity than to give a talk saying that we should remove complexity. A lot of the complexity buys us something, either directly or indirectly. When we looked at the input of a fancy modern keyboard vs. the apple 2 keyboard, we saw that using a relatively powerful and expensive general purpose processor to handle keyboard inputs can be slower than dedicated logic for the keyboard, which would both be simpler and cheaper.
However, using the processor gives people the ability to easily customize the keyboard, and also pushes the problem of "programming" the keyboard from hardware into software, which reduces the cost of making the keyboard. The more expensive chip increases the manufacturing cost, but considering how much of the cost of these small-batch artisanal keyboards is the design cost, it seems like a net win to trade manufacturing cost for ease of programming.

We see this kind of tradeoff in every part of the pipeline. One of the biggest examples of this is the OS you might run on a modern desktop vs. the loop that's running on the apple 2. Modern OSes let programmers write generic code that can deal with having other programs simultaneously running on the same machine, and do so with pretty reasonable general performance, but we pay a huge complexity cost for this and the handoffs involved in making this easy result in a significant latency penalty.

A lot of the complexity might be called accidental complexity, but most of that accidental complexity is there because it's so convenient. At every level from the hardware architecture to the syscall interface to the I/O framework we use, we take on complexity, much of which could be eliminated if we could sit down and re-write all of the systems and their interfaces today, but it's too inconvenient to re-invent the universe to reduce complexity and we get benefits from economies of scale, so we live with what we have.

For those reasons and more, in practice, the solution to poor performance caused by "excess" complexity is often to add more complexity.
In particular, the gains we've seen that get us back to the quickness of the quickest machines from thirty to forty years ago have come not from listening to exhortations to reduce complexity, but from piling on more complexity.

The ipad pro is a feat of modern engineering; the engineering that went into increasing the refresh rate on both the input and the output as well as making sure the software pipeline doesn't have unnecessary buffering is complex! The design and manufacture of high-refresh-rate displays that can push system latency down is also non-trivially complex in ways that aren't necessary for bog standard 60 Hz displays.

This is actually a common theme when working on latency reduction. A common trick to reduce latency is to add a cache, but adding a cache to a system makes it more complex. For systems that generate new data and can't tolerate a cache, the solutions are often even more complex. An example of this might be large scale RoCE deployments. These can push remote data access latency from the millisecond range down to the microsecond range, which enables new classes of applications. However, this has come at a large cost in complexity. Early large-scale RoCE deployments easily took tens of person years of effort to get right and also came with a tremendous operational burden.

Conclusion

It's a bit absurd that a modern gaming machine running at 4,000x the speed of an apple 2, with a CPU that has 500,000x as many transistors (with a GPU that has 2,000,000x as many transistors) can maybe manage the same latency as an apple 2 in very carefully coded applications if we have a monitor with nearly 3x the refresh rate.
It's perhaps even more absurd that the default configuration of the powerspec g405, which had the fastest single-threaded performance you could get until October 2017, had more latency from keyboard-to-screen (approximately 3 feet, maybe 10 feet of actual cabling) than sending a packet around the world (16187 mi from NYC to Tokyo to London back to NYC, more due to the cost of running the shortest possible length of fiber).

On the bright side, we're arguably emerging from the latency dark ages and it's now possible to assemble a computer or buy a tablet with latency that's in the same range as you could get off-the-shelf in the 70s and 80s. This reminds me a bit of the screen resolution & density dark ages, where CRTs from the 90s offered better resolution and higher pixel density than affordable non-laptop LCDs until relatively recently. 4k displays have now become normal and affordable 8k displays are on the horizon, blowing past anything we saw on consumer CRTs. I don't know that we'll see the same kind of improvement with respect to latency, but one can hope. There are individual developers improving the experience for people who use certain, very carefully coded, applications, but it's not clear what force could cause a significant improvement in the default experience most users see.

Other posts on latency measurement

Terminal latency
Keyboard latency
Mouse vs. keyboard latency (human factors, not device latency)
Editor latency (by Pavel Fatin)
Windows 10 compositing latency (by Pekka Vaananen)
AR/VR latency (by Michael Abrash)
Latency mitigation strategies (by John Carmack)

Appendix: why measure latency?

Latency matters! For very simple tasks, people can perceive latencies down to 2 ms or less. Moreover, increasing latency is not only noticeable to users, it causes users to execute simple tasks less accurately.
If you want a visual demonstration of what latency looks like and you don't have a super-fast old computer lying around, check out this MSR demo on touchscreen latency.

The most commonly cited document on response time is the nielsen group article on response times, which claims that latencies below 100ms feel equivalent and are perceived as instantaneous. One easy way to see that this is false is to go into your terminal and try sleep 0; echo "pong" vs. sleep 0.1; echo "test" (or for that matter, try playing an old game that doesn't have latency compensation, like quake 1, with 100 ms ping, or even 30 ms ping, or try typing in a terminal with 30 ms ping). For more info on this and other latency fallacies, see this document on common misconceptions about latency.

Throughput also matters, but this is widely understood and measured. If you go to pretty much any mainstream review or benchmarking site, you can find a wide variety of throughput measurements, so there's less value in writing up additional throughput measurements.

Appendix: apple 2 keyboard

The apple 2e, instead of using a programmed microcontroller to read the keyboard, uses a much simpler custom chip designed for reading keyboard input, the AY 3600. If we look at the AY 3600 datasheet, we can see that the scan time is (90 * 1/f) and the debounce time is listed as strobe_delay. These quantities are determined by some capacitors and a resistor, which appear to be 47pf, 100k ohms, and 0.022uf for the Apple 2e. Plugging these numbers into the AY 3600 datasheet, we can see that f = 50 kHz, giving us a 1.8 ms scan delay and a 6.8 ms debounce delay (assuming the values are accurate -- capacitors can degrade over time, so we should expect the real delays to be shorter on our old Apple 2e), giving us less than 8.6 ms for the internal keyboard logic.

Comparing to a keyboard with a 167 Hz scan rate that scans two extra times to debounce, the equivalent figure is 3 * 6 ms = 18 ms.
With a 100Hz scan rate, that becomes 3 * 10 ms = 30 ms. 18 ms to 30 ms of keyboard scan plus debounce latency is in line with what we saw when we did some preliminary keyboard latency measurements.

For reference, the ergodox uses a 16 MHz microcontroller with ~80k transistors and the apple 2e CPU is a 1 MHz chip with 3.5k transistors.

Appendix: why should android phones have higher latency than old apple phones?

As we've seen, raw processing power doesn't help much with many of the causes of latency in the pipeline, like handoffs between different processes, so an android phone with a 10x more powerful processor than an ancient iphone isn't guaranteed to be quicker to respond, even if it can render javascript heavy pages faster.

If you talk to people who work on non-Apple mobile CPUs, you'll find that they run benchmarks like dhrystone (a synthetic benchmark that was irrelevant even when it was created, in 1984) and SPEC2006 (an updated version of a workstation benchmark that was relevant in the 90s and perhaps even as late as the early 2000s if you care about workstation workloads, which are completely different from mobile workloads). This is a problem where the vendor who makes the component has an intermediate target that's only weakly correlated to the actual user experience. I've heard that there are people working on the pixel phones who care about end-to-end latency, but it's difficult to get good latency when you have to use components that are optimized for things like dhrystone and SPEC2006.

If you talk to people at Apple, you'll find that they're quite cagey, but that they've been targeting the end-to-end user experience for quite a long time and that they can do "full stack" optimizations that are difficult for android vendors to pull off.
They're not literally impossible, but making a change to a chip that has to be threaded up through the OS is something you're very unlikely to see unless google is doing the optimization, and google hasn't really been serious about the end-to-end experience until recently.

Having relatively poor performance in aspects that aren't measured is a common theme and one we saw when we looked at terminal latency. Prior to examining terminal latency, public benchmarks were all throughput oriented and the terminals that prioritized performance worked on increasing throughput, even though increasing terminal throughput isn't really useful. After those terminal latency benchmarks, some terminal authors looked into their latency and found places they could trim down buffering and remove latency. You get what you measure.

Appendix: experimental setup

Most measurements were taken with the 240fps camera (4.167 ms resolution) in the iPhone SE. Devices with response times below 40 ms were re-measured with a 1000fps camera (1 ms resolution), the Sony RX100 V in PAL mode. Results in the tables are the results of multiple runs and are rounded to the nearest 10 ms to avoid the impression of false precision. For desktop results, results are measured from when the key started moving until the screen finished updating. Note that this is different from most key-to-screen-update measurements you can find online, which typically use a setup that effectively removes much or all of the keyboard latency, which, as an end-to-end measurement, is only realistic if you have a psychic link to your computer (this isn't to say the measurements aren't useful -- if, as a programmer, you want a reproducible benchmark, it's nice to reduce measurement noise from sources that are beyond your control, but that's not relevant to end users). People often advocate measuring from one of: {the key bottoming out, the tactile feel of the switch}.
Other than for measurement convenience, there appears to be no reason to do either of these, but people often claim that's when the user expects the keyboard to "really" work. But these are independent of when the switch actually fires. Both the distance between the key bottoming out and activation as well as the distance between feeling feedback and activation are arbitrary and can be tuned. See this post on keyboard latency measurements for more info on keyboard fallacies.

Another significant difference is that measurements were done with settings as close to the default OS settings as possible since approximately 0% of users will futz around with display settings to reduce buffering, disable the compositor, etc. Waiting until the screen has finished updating is also different from what most end-to-end measurements do -- most consider the update "done" when any movement has been detected on the screen. Waiting until the screen is finished changing is analogous to webpagetest's "visually complete" time.

Computer results were taken using the "default" terminal for the system (e.g., powershell on windows, lxterminal on lubuntu), which could easily cause 20 ms to 30 ms difference between a fast terminal and a slow terminal. Between measuring time in a terminal and measuring the full end-to-end time, measurements in this article should be slower than measurements in other, similar, articles (which tend to measure time to first change in games).

The powerspec g405 baseline result is using integrated graphics (the machine doesn't come with a graphics card) and the 60 Hz result is with a cheap video card. The baseline result was at 30 Hz because the integrated graphics only supports hdmi output and the display it was attached to only runs at 30 Hz over hdmi.

Mobile results were done by using the default browser, browsing to https://danluu.com, and measuring the latency from finger movement until the screen first updates to indicate that scrolling has occurred.
In the cases where this didn't make sense (kindles, gameboy color, etc.), some action that makes sense for the platform was taken (changing pages on the kindle, pressing the joypad on the gameboy color in a game, etc.). Unlike with the desktop/laptop measurements, this end-time for the measurement was on the first visual change to avoid including many frames of scrolling. To make the measurement easy, the measurement was taken with a finger on the touchscreen and the timer was started when the finger started moving (to avoid having to determine when the finger first contacted the screen).

In the case of "ties", results are ordered by the unrounded latency as a tiebreaker, but this shouldn't be considered significant. Differences of 10 ms should probably also not be considered significant.

The custom haswell-e was tested with gsync on and there was no observable difference. The year for that box is somewhat arbitrary, since the CPU is from 2014, but the display is newer (I believe you couldn't get a 165 Hz display until 2015).

The number of transistors for some modern machines is a rough estimate because exact numbers aren't public. Feel free to ping me if you have a better estimate!

The color scales for latency and year are linear and the color scales for clock speed and number of transistors are log scale.

All Linux results were done with a pre-KPTI kernel. It's possible that KPTI will impact user perceivable latency.

Measurements were done as cleanly as possible (without other things running on the machine/device when possible, with a device that was nearly full on battery for devices with batteries). Latencies when other software is running on the device or when devices are low on battery might be much higher.

If you want a reference to compare the kindle against, a moderately quick page turn in a physical book appears to be about 200 ms.

This is a work in progress. I expect to get benchmarks from a lot more old computers the next time I visit Seattle.
If you know of old computers I can test in the NYC area (that have their original displays or something like them), let me know! If you have a device you'd like to donate for testing, feel free to mail it to

Dan Luu
Recurse Center
455 Broadway, 2nd Floor
New York, NY 10013

Thanks to RC, David Albert, Bert Muthalaly, Christian Ternus, Kate Murphy, Ikhwan Lee, Peter Bhat Harkins, Leah Hanson, Alicia Thilani Singham Goodwin, Amy Huang, Dan Bentley, Jacquin Mininger, Rob, Susan Steinman, Raph Levien, Max McCrea, Peter Town, Jon Cinque, Anonymous, and Jonathan Dahan for donating devices to test and thanks to Leah Hanson, Andy Matuschak, Milosz Danczak, amos (@fasterthanlime), @emitter_coupled, Josh Jordan, mrob, and David Albert for comments/corrections/discussion.
A statement I commonly hear in tech-utopian circles is that some seeming inefficiency can't actually be inefficient because the market is efficient and inefficiencies will quickly be eliminated. A contentious example of this is the claim that companies can't be discriminating because the market is too competitive to tolerate discrimination. A less contentious example is that when you see a big company doing something that seems bizarrely inefficient, maybe it's not inefficient and you just lack the information necessary to understand why the decision was efficient.

Unfortunately, arguments like this are difficult to settle because, even in retrospect, it's usually not possible to get enough information to determine the precise "value" of a decision. Even in cases where the decision led to an unambiguous success or failure, there are so many factors that led to the result that it's difficult to figure out precisely why something happened.

One nice thing about sports is that they often have detailed play-by-play data and well-defined win criteria which lets us tell, on average, what the expected value of a decision is. In this post, we'll look at the cost of bad decision making in one sport and then briefly discuss why decision quality in sports might be the same as or better than decision quality in other fields.

Just to have a concrete example, we're going to look at baseball, but you could do the same kind of analysis for football, hockey, basketball, etc., and my understanding is that you'd get a roughly similar result in all of those cases.

We're going to model baseball as a state machine, both because that makes it easy to understand the expected value of particular decisions and because this lets us talk about the value of decisions without having to go over most of the rules of baseball.

We can treat each baseball game as an independent event. In each game, two teams play against each other and the team that scores more runs (points) wins.
Each game is split into 9 "innings" and in each inning each team will get one set of chances on offense. In each inning, each team will play until it gets 3 "outs". Any given play may or may not result in an out.

One chunk of state in our state machine is the number of outs and the inning. The other chunks of state we're going to track are who's "on base" and which player is "at bat". Each team defines some order of batters for their active players and after each player bats once this repeats in a loop until the team collects 3 outs and the inning is over. The state of who is at bat is saved between innings. Just for example, you might see batters 1-5 bat in the first inning, 6-9 and then 1 again in the second inning, 2- etc.

When a player is at bat, the player may advance to a base and players who are on base may also advance, depending on what happens. When a player advances 4 bases (that is, through 1B, 2B, 3B, to what would be 4B except that it isn't called that) a run is scored and the player is removed from the base. As mentioned above, various events may cause a player to be out, in which case they also stop being on base.

An example state from our state machine is:

{1B, 3B; 2 outs}

This says that there's a player on 1B, a player on 3B, and there are two outs. Note that this is independent of the score, who's actually playing, and the inning.

Another state is:

{--; 0 outs}

With a model like this, if we want to determine the expected value of the above state, we just need to look up the total number of runs across all innings played in a season divided by the number of innings to find the expected number of runs from the state above (ignoring the 9th inning because a quirk of baseball rules distorts statistics from the 9th inning).
If we do this, we find that, from the above state, a team will score .555 runs in expectation.

We can then compute the expected number of runs for all of the other states:

bases       0 outs   1 out    2 outs
--           .555     .297     .117
1B           .953     .573     .251
2B          1.189     .725     .344
3B          1.482     .983     .387
1B,2B       1.573     .971     .466
1B,3B       1.904    1.243     .538
2B,3B       2.052    1.467     .634
1B,2B,3B    2.417    1.650     .815

In this table, each entry is the expected number of runs from the remainder of the inning from some particular state. Each column shows the number of outs and each row shows the state of the bases. The color coding scheme is: the starting state (.555 runs) has a white background. States with higher run expectation are more blue and states with lower run expectation are more red.

This table and the other stats in this post come from The Book by Tango et al., which mostly discussed baseball between 1999 and 2002. See the appendix if you're curious about how things change if we use a more detailed model.

The state we're tracking for an inning here is who's on base and the number of outs. Innings start with nobody on base and no outs.

As above, we see that we start the inning with .555 runs in expectation. If a play puts someone on 1B without getting an out, we now have .953 runs in expectation, i.e., putting someone on first without an out is worth .953 - .555 = .398 runs.

This immediately gives us the value of some decisions, e.g., trying to steal 2B with no outs and someone on first. If we look at cases where the batter's state doesn't change, a successful steal moves us to the {2B, 0 outs} state, i.e., it gives us 1.189 - .953 = .236 runs. A failed steal moves us to the {--, 1 out} state, i.e., it gives us .297 - .953 = -.656 runs. To break even, we need to succeed .656 / .236 = 2.78x more often than we fail, i.e., we need a .735 success rate to break even.
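
The break-even arithmetic above can be reproduced from the run expectancy table. The following is a minimal sketch, not anything from The Book: the names RUN_EXPECTANCY and steal_break_even are made up for illustration, and the only data used is the table above.

```python
# Expected runs for the rest of the inning, keyed by bases occupied.
# Each tuple holds the values for 0, 1, and 2 outs, copied from the table above.
RUN_EXPECTANCY = {
    "--": (0.555, 0.297, 0.117),
    "1B": (0.953, 0.573, 0.251),
    "2B": (1.189, 0.725, 0.344),
    "3B": (1.482, 0.983, 0.387),
    "1B,2B": (1.573, 0.971, 0.466),
    "1B,3B": (1.904, 1.243, 0.538),
    "2B,3B": (2.052, 1.467, 0.634),
    "1B,2B,3B": (2.417, 1.650, 0.815),
}


def expected_runs(bases, outs):
    """Look up the run expectancy of a (bases, outs) state."""
    return RUN_EXPECTANCY[bases][outs]


def steal_break_even(before, success, failure):
    """Success rate at which a steal attempt has zero expected value.

    Each argument is a (bases, outs) state. This hypothetical helper assumes
    the batter's state doesn't change during the attempt.
    """
    start = expected_runs(*before)
    gain = expected_runs(*success) - start  # runs added by a successful steal
    loss = start - expected_runs(*failure)  # runs lost when the steal fails
    # Break even when p * gain == (1 - p) * loss, so p = loss / (gain + loss).
    return loss / (gain + loss)


# Stealing 2B with a runner on first and no outs.
rate = steal_break_even(("1B", 0), ("2B", 0), ("--", 1))
print(f"break-even success rate: {rate:.3f}")  # 0.735
```

Passing different states to the same function gives the break-even rate for other steal situations, under the same assumption that nothing else about the inning changes.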
If we want to compute the average value of a stolen base, we can compute the weighted sum over all states, but for now, let's just say that it's possible to do so and that you need something like a .735 success rate for stolen bases to make sense.

We can then look at the stolen base success rate of teams to see that, in any given season, maybe 5-10 teams are doing better than breakeven, leaving 20-25 teams at breakeven or below (mostly below). If we look at a bad but not historically bad stolen-base team of that era, they might have a .6 success rate. It wouldn't be unusual for a team from that era to make between 100 and 200 attempts. Just so we can compute an approximation, if we assume they were all attempts from the {1B, 0 outs} state, the average run value per attempt would be .4 * (-.656) + .6 * .236 = -0.12 runs per attempt. Another first-order approximation is that a delta of 10 runs is worth 1 win, so at 100 attempts we have -1.2 wins and at 200 attempts we have -2.4 wins.

If we run the math across actual states instead of using the first order approximation, we see that the average failed steal is worth -.467 runs and the average successful steal is worth .175 runs. In that case, a steal attempt with a .6 success rate is worth .4 * (-.467) + .6 * .175 = -0.082 runs. With this new approximation, our estimate for the approximate cost in wins of stealing "as normal" vs. having a "no stealing" rule for a team that steals badly and often is .82 to 1.64 wins per season. Note that this underestimates the cost of stealing since getting into position to steal increases the odds of a successful "pickoff", which we haven't accounted for. From our state-machine standpoint, a pickoff is almost equivalent to a failed steal, but the analysis necessary to compute the difference in pickoff probability is beyond the scope of this post.

We can also do this for other plays coaches can cause (or prevent).
For the "intentional walk", we see that an intentional walk appears to be worth .102 runs for the opposing team. In 2002, a team that issued a lot of intentional walks might have issued 50, resulting in 50 * .102 runs for the opposing team, giving a loss of roughly 5 runs or .5 wins.

If we optimistically assume a "sac bunt" never fails, the cost of a sac bunt is .027 runs per attempt. If we look at the league where pitchers don't bat, a team that was heavy on sac bunts might've done 49 sac bunts (we do this to avoid pitcher bunts, which add complexity to the approximation), costing a total of 49 * .027 = 1.32 runs or .132 wins.

Another decision that's made by a coach is setting the batting order. Players bat (take a turn) in order, 1-9, mod 9. That is, when the 10th player is up, we actually go back around and the 1st player bats. At some point the game ends, so not everyone on the team ends up with the same number of "at bats".

There's a just-so story that justifies putting the fastest player first, someone with a high "batting average" second, someone pretty good third, your best batter fourth, etc. This story, or something like it, has been standard for over 100 years.

I'm not going to walk through the math for computing a better batting order because I don't think there's a short, easy to describe, approximation. It turns out that if we compute the difference between an "optimal" order and a "typical" order justified by the story in the previous paragraph, using an optimal order appears to be worth between 1 and 2 wins per season.

These approximations all leave out important information. In three out of the four cases, we assumed an average player at all times and didn't look at who was at bat. The information above actually takes this into account to some extent, but not fully.
How exactly this differs from a better approximation is a long story and probably too much detail for a post that's using baseball to talk about decisions outside of baseball, so let's just say that we have a pretty decent but not amazing approximation that says that a coach who makes bad decisions following conventional wisdom that are in the normal range of bad decisions during a baseball season might cost their team something like 1 + 1.2 + .5 + .132 = 2.83 wins on these four decisions alone vs. a decision rule that says "never do these actions that, on average, have negative value". If we compare to a better decision rule such as "do these actions when they have positive value and not when they have negative value" or a manager that generally makes good decisions, let's conservatively estimate that's maybe worth 3 wins.

We've looked at four decisions (sac bunt, steal, intentional walk, and batting order). But there are a lot of other decisions! Let's arbitrarily say that if we look at all decisions and not just these four decisions, having a better heuristic for all decisions might be worth 4 or 5 wins per season.

What does 4 or 5 wins per season really mean? One way to look at it is that baseball teams play 162 games, so an "average" team wins 81 games. If we look at the seasons covered, the number of wins that teams that made the playoffs had was {103, 94, 103, 99, 101, 97, 98, 95, 95, 91, 116, 102, 88, 93, 93, 92, 95, 97, 95, 94, 87, 91, 91, 95, 103, 100, 97, 97, 98, 95, 97, 94}. Because of the structure of the system, we can't name a single number for a season and say that N wins are necessary to make the playoffs and that teams with fewer than N wins won't make the playoffs, but we can say that 95 wins gives a team decent odds of making the playoffs. 95 - 81 = 14. 5 wins is more than a third of the difference between an average team and a team that makes the playoffs.
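
The back-of-the-envelope numbers above are easy to check. Here's a minimal sketch; the variable names are made up for illustration, and the labels on the four cost estimates are an assumption about which term in the sum corresponds to which decision (the low end of the batting order range, steals at 100 attempts, 50 intentional walks, and 49 sac bunts).

```python
# Approximate wins lost per season to each decision. The pairing of labels
# to numbers is an assumption made for this sketch.
WINS_LOST = {
    "batting order": 1.0,
    "steals": 1.2,
    "intentional walks": 0.5,
    "sac bunts": 0.132,
}

# Win totals of playoff teams over the seasons covered, as listed above.
PLAYOFF_WINS = [
    103, 94, 103, 99, 101, 97, 98, 95,
    95, 91, 116, 102, 88, 93, 93, 92,
    95, 97, 95, 94, 87, 91, 91, 95,
    103, 100, 97, 97, 98, 95, 97, 94,
]

GAMES_PER_SEASON = 162
AVERAGE_TEAM_WINS = GAMES_PER_SEASON // 2  # an average team wins half its games
PLAYOFF_TARGET = 95  # "decent odds" of making the playoffs, per the text

total_lost = sum(WINS_LOST.values())
gap = PLAYOFF_TARGET - AVERAGE_TEAM_WINS
share_of_gap = 5 / gap  # what a 5 win improvement covers

print(f"wins lost on the four decisions: {total_lost:.2f}")
print(f"gap from average to playoff target: {gap} wins")
print(f"5 wins covers {share_of_gap:.0%} of that gap")
print(f"fewest wins by a playoff team: {min(PLAYOFF_WINS)}")
```

The minimum of the list shows why no single N works as a cutoff: at least one team in the sample made the playoffs with well under 95 wins.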
This is a huge deal both in terms of prestige and also direct economic value.

If we want to look at it at the margin instead of on average, the smallest delta in wins between teams that made the playoffs and teams that didn't in each league was {1, 7, 8, 1, 6, 2, 6, 3}. For teams that are on the edge, a delta of 5 wins wouldn't always be the difference between a successful season (making playoffs) and an unsuccessful season (not making playoffs), but there are teams within a 5 win delta of making the playoffs in most seasons. If we were actually running a baseball team, we'd want to use a much more fine-grained model, but as a first approximation we can say that in-game decisions are a significant factor in team performance and that, using some kind of computation, we can determine the expected cost of non-optimal decisions.

Another way to look at what 5 wins is worth is to look at what it costs to get a player who's not a pitcher that's 5 wins above average (WAA) (we look at non-pitchers because non-pitchers tend to play in every game and pitchers tend to play in parts of some games, making a comparison between pitchers and non-pitchers more complicated). There are 8 non-pitcher positions and 30 teams, so we have 240 team-position pairs. In 2002, of these 240 team-position pairs, there are two that were >= 5 WAA, Texas-SS (Alex Rodriguez, paid $22m) and SF-LF (Barry Bonds, paid $15m). If we look at the other seasons in the range of dates we're looking at, there are either 2 or 3 team-position pairs where a team is able to get >= 5 WAA in a season. These aren't stable across seasons because player performance is volatile, so it's not as easy as finding someone great and paying them $15m. For example, in 2002, there were 7 non-pitchers paid $14m or more and only two of them were worth 5 WAA or more.
For reference, the average total team payroll (teams have 26 players per) in 2002 was $67m, with a minimum of $34m and a max of $126m. At the time a $1m salary for a manager would've been considered generous, making a 5 WAA manager an incredible deal.

5 WAA assumes typical decision making lining up with events in a bad, but not worst-case way. A more typical case might be that a manager costs a team 3 wins. In that case, in 2002, there were 25 team-position pairs out of 240 where a single player could make up for the loss caused from management by conventional wisdom. Players who provide that much value and who aren't locked up in artificially cheap deals with particular teams due to the mechanics of player transfers are still much more expensive than managers.

If we look at how teams have adopted data analysis in order to improve both in-game decision making and team-composition decisions, it's been a slow, multi-decade, process. Moneyball describes part of the shift from using intuition and observation to select players to incorporating statistics into the process. Stats nerds were talking about how you could do this at least since 1971 and no team really took it seriously until the 90s and the ideas didn't really become mainstream until the mid 2000s, after a bestseller had been published.

If we examine how much teams have improved at the in-game decisions we looked at here, the process has been even slower. It's still true today that statistics-driven decisions aren't mainstream. Things are getting better, and if we look at the aggregate cost of the non-optimal decisions mentioned here, the aggregate cost has been getting lower over the past couple decades as intuition-driven decisions slowly converge to more closely match what stats nerds have been saying for decades.
For example, if we look at the total number of sac bunts recorded across all teams from 1999 until now, we see:

year    sac bunts
1999    1604
2000    1628
2001    1607
2002    1633
2003    1626
2004    1731
2005    1620
2006    1651
2007    1540
2008    1526
2009    1635
2010    1544
2011    1667
2012    1479
2013    1383
2014    1343
2015    1200
2016    1025
2017    925

Despite decades of statistical evidence that sac bunts are overused, we didn't really see a decline across all teams until 2012 or so. Why this is varies on a team-by-team and case-by-case basis, but the fundamental story that's been repeated over and over again both for statistically-driven team composition and statistically driven in-game decisions is that the people who have the power to make decisions often stick to conventional wisdom instead of using "radical" statistically-driven ideas. There are a number of reasons as to why this happens. One high-level reason is that the change we're talking about was a cultural change and cultural change is slow. It doesn't surprise people when it takes a generation for scientific consensus to shift and why should baseball be any different?

One specific lower-level reason "obviously" non-optimal decisions can persist for so long is that there's a lot of noise in team results. You sometimes see a manager make some radical decisions (not necessarily statistics-driven), followed by some poor results, causing management to fire the manager. There's so much volatility that you can't really judge players or managers based on small samples, but this doesn't stop people from doing so. The combination of volatility and skepticism of radical ideas heavily disincentivizes going against conventional wisdom.

Among the many consequences of this noise is the fact that the winner of the "world series" (the baseball championship) is heavily determined by randomness. Whether or not a team makes the playoffs is determined over 162 games, which isn't enough to remove all randomness, but is enough that the result isn't mostly determined by randomness.
This isn't true of the playoffs, which are too short for the outcome to be primarily determined by the difference in the quality of teams. Once a team wins the world series, people come up with all kinds of just-so stories to justify why the team should've won, but if we look across all games, we can see that the stories are just stories. This is, perhaps, not so different to listening to people tell you why their startup was successful.

There are metrics we can use that are better predictors of future wins and losses (i.e., are less volatile than wins and losses), but, until recently, convincing people that those metrics were meaningful was also a radical idea.

If we think about the general case, what's happening is that decisions have probabilistic payoffs. There's very high variance in actual outcomes (wins and losses), so it's possible to make good decisions and not see the direct effect of them for a long time. Even if there are metrics that give us a better idea of what the "true" value of a decision is, if you're operating in an environment where your management doesn't believe in those metrics, you're going to have a hard time keeping your job (or getting a job in the first place) if you want to do something radical whose value is only demonstrated by some obscure-sounding metric unless they take a chance on you for a year or two. There have been some major phase changes in what metrics are accepted, but they've taken decades.

If we look at business or engineering decisions, the situation is much messier. If we look at product or infrastructure success as a "win", there seems to be much more noise in whether or not a team gets a "win". Moreover, unlike in baseball, the sort of play-by-play or even game data that would let someone analyze "wins" and "losses" to determine the underlying cause isn't recorded, so it's impossible to determine the true value of decisions.
And even if the data were available, there are so many more factors that determine whether or not something is a "win" that it's not clear if we'd be able to determine the expected value of decisions even if we had the data.

We've seen that in a field where one can sit down and determine the expected value of decisions, it can take decades for this kind of analysis to influence some important decisions. If we look at fields where it's more difficult to determine the true value of decisions, how long should we expect it to take for good decision making to surface? It seems like it would be a while, perhaps forever, unless there's something about the structure of baseball and other sports that makes it particularly difficult to remove a poor decision maker and insert a better decision maker.

One might argue that baseball is different because there are a fixed number of teams and it's quite unusual for a new team to enter the market, but if you look at things like public clouds, operating systems, search engines, car manufacturers, etc., the situation doesn't look that different. If anything, it appears to be much cheaper to take over a baseball team and replace management (you sometimes see baseball teams sell for roughly a billion dollars) and there are more baseball teams than there are competitive products in the markets we just discussed, at least in the U.S. One might also argue that, if you look at the structure of baseball teams, it's clear that positions are typically not handed out based on decision-making merit and that other factors tend to dominate, but this doesn't seem obviously more true in baseball than in engineering fields.

This isn't to say that we expect obviously bad decisions everywhere.
You might get that idea if you hung out on baseball stats nerd forums before Moneyball was published (and for quite some time after), but if you looked at Formula 1 (F1) around the same time, you'd see teams employing PhDs who are experts in economics and game theory to make sure they were making reasonable decisions. This doesn't mean that F1 teams always make perfect decisions, but they at least avoided making decisions that interested amateurs could identify as inefficient for decades. There are some fields where competition is cutthroat and you have to do rigorous analysis to survive and there are some fields where competition is more sedate. In living memory, there was a time when training for sports was considered ungentlemanly and someone who trained with anything resembling modern training techniques would've had a huge advantage. Over the past decade or so, we're seeing the same kind of shift but for statistical techniques in baseball instead of training in various sports.

If we want to look at the quality of decision making, it's too simplistic to say that we expect a firm to make good decisions because they're exposed to markets and there's economic value in making good decisions and people within the firm will probably be rewarded greatly if they make good decisions. You can't even tell if this is happening by asking people if they're making rigorous, data-driven decisions. If you'd asked people in baseball whether they were using data in their decisions, they would've said yes throughout the 70s and 80s. Baseball has long been known as a sport where people track all kinds of numbers and then use those numbers. It's just that people didn't backtest their predictions, let alone backtest their predictions with holdouts.

The paradigm shift of using data effectively to drive decisions has been hitting different fields at different rates over the past few decades, both inside and outside of sports.
Why this change happened in F1 before it happened in baseball is due to a combination of the difference in incentive structure in F1 teams vs. baseball teams and the difference in institutional culture. We may take a look at this in a future post, but this turns out to be a fairly complicated issue that requires a lot more background. We'll try to explore the necessary background in future posts.

Appendix: non-idealities in our baseball analysis

In order to make this a short blog post and not a book, there are a lot of simplifications in the approximation we discussed. One major simplification is the idea that all runs are equivalent. This is close enough to true that this is a decent approximation. But there are situations where the approximation isn't very good, such as when it's the 9th inning and the game is tied. In that case, a decision that increases the probability of scoring 1 run but decreases the probability of scoring multiple runs is actually the right choice.

This is often given as a justification for a relatively late-game sac bunt. But if we look at the probability of a successful sac bunt, we see that it goes down in later innings. We didn't talk about how the defense is set up, but defenses can set up in ways that reduce the probability of a successful sac bunt but increase the probability of success of non-bunts and vice versa. Before the last inning, this actually makes the sac bunt worse late in the game and not better! If we take all of that into account in the last inning of a tie game, the probability that a sac bunt is a good idea then depends on something else we haven't discussed, the batter at the plate.

In our simplified model, we computed the expected value in runs across all batters. But at any given time, a particular player is batting. A successful sac bunt advances runners and increases the number of outs by one. The alternative is to let the batter "swing away", which will result in some random outcome.
The better the batter, the higher the probability of an outcome that's better than the outcome of a sac bunt. To determine the optimal decision, we not only need to know how good the current batter is but how good the subsequent batters are. One common justification for the sac bunt is that pitchers are terrible hitters and they're not bad at sac bunting because they have so much practice doing it (because they're terrible hitters), but it turns out that pitchers are also below average sac bunters and that the argument that we should expect pitchers to sac because they're bad hitters doesn't hold up if we look at the data in detail.

Another reason to sac bunt (or bunt in general) is that the tendency to sometimes do this induces changes in defense which make non-bunt plays work better.

A full computation should also take into account the number of balls and strikes the current batter has (a piece of state we haven't discussed at all), the speed of the batter and the players on base, the particular stadium the game is being played in, the opposing pitcher, and the quality of the opposing defense. All of this can be done, even on a laptop -- this is all small data as far as computers are concerned, but walking through the analysis even for one particular decision would be substantially longer than everything in this post combined, including this disclaimer. It's perhaps a little surprising that taking all of these non-idealities into account doesn't overturn the general result, but it turns out that it doesn't (it finds that there are many situations in which sac bunts have positive expected value, but that sac bunts were still heavily overused for decades).

There's a similar situation for intentional walks, where the non-idealities in our analysis appear to support issuing intentional walks.
In particular, the two main conventional justifications for an intentional walk are

1. By walking the current batter, we can set up a force or a double play (increase the probability of getting one out or two outs in one play). If the game is tied in the last inning, putting another player on base has little downside and has the upside of increasing the probability of allowing zero runs and continuing the tie.
2. By walking the current batter, we can get to the next, worse batter.

An example situation where people apply the justification in (1) is in the {1B, 3B; 2 out} state. The team that's on defense will lose if the player at 3B advances one base. The reasoning goes, walking a player and changing the state to {1B, 2B, 3B; 2 out} won't increase the probability that the player at 3B will score and end the game if the current batter "puts the ball into play", and putting another player on base increases the probability that the defense will be able to get an out.

The hole in this reasoning is that the batter won't necessarily put the ball into play. After the state is {1B, 2B, 3B; 2 out}, the pitcher may issue an unintentional walk, causing each runner to advance and losing the game. It turns out that being in this state doesn't affect the probability of an unintentional walk very much. The pitcher tries very hard to avoid a walk but, at the same time, the batter tries very hard to induce a walk!

On (2), the two situations where the justification tends to be applied are when the current player at bat is good or great, or the current player is batting just before the pitcher. Let's look at these two separately.

Barry Bonds's seasons from 2001, 2002, and 2004 were some of the statistically best seasons of all time and are as extreme a case as one can find in modern baseball.
If we run our same analysis and account for the quality of the players batting after Bonds, we find that it's sometimes the correct decision for the opposing team to intentionally walk Bonds, but it was still the case that most situations did not warrant an intentional walk and that Bonds was often intentionally walked in a situation that didn't warrant an intentional walk. In the case of a batter who is not having one of the statistically best seasons on record in modern baseball, intentional walks are even less good.

In the case of the pitcher batting, doing the same kind of analysis as above also reveals that there are situations where an intentional walk is appropriate (not-late game, {1B, 2B; 2 out}, when the pitcher is not a significantly above average batter for a pitcher). Even though it's not always the wrong decision to issue an intentional walk, the intentional walk is still grossly overused.

One might argue the fact that our simple analysis has all of these non-idealities that could have invalidated the analysis is a sign that decision making in baseball wasn't so bad after all, but I don't think that holds. A first-order approximation that someone could do in an hour or two finds that decision making seems quite bad, on average. If a team was interested in looking at data, that ought to lead them into doing a more detailed analysis that takes into account the conventional-wisdom based critiques of the obvious one-hour analysis. It appears that this wasn't done, at least not for decades.

The problem is that before people started running the data, all we had to go by were stories. Someone would say "with 2 outs, you should walk the batter before the pitcher [in some situations] to get to the pitcher and get the guaranteed out".
Someone else might respond "we obviously shouldn't do that late game because the pitcher will get subbed out for a pinch hitter and early game, we shouldn't do it because even if it works and we get the easy out, it sets the other team up to lead off the next inning with their #1 hitter instead of an easy out". Which of these stories is the right story turns out to be an empirical question. The thing that I find most unfortunate is that, after people started running the numbers and the argument became one of stories vs. data, people persisted in sticking with the story-based argument for decades. We see the same thing in business and engineering, but it's arguably more excusable there because decisions in those areas tend to be harder to quantify. Even if you can reduce something to a simple engineering equation, someone can always argue that the engineering decision isn't what really matters and this other business concern that's hard to quantify is the most important thing.

Appendix: possession

Something I find interesting is that statistical analysis in football, baseball, and basketball has found that teams have overwhelmingly undervalued possessions for decades. Baseball doesn't have the concept of possession per se, but if you look at being on offense as "having possession" and getting 3 outs as "losing possession", it's quite similar.

In football, we see that maintaining possession is such a big deal that it is usually an error to punt on 4th down, but this hasn't stopped teams from punting by default basically forever. And in basketball, players who shoot a lot with a low shooting percentage were (and arguably still are) overrated.

I don't think this is fundamental -- that possessions are as valuable as they are comes out of the rules of each game. It's arbitrary. I still find it interesting, though.

Appendix: other analysis of management decisions

Bloom et al., Does management matter?
Evidence from India looks at the impact of management interventions and the effect on productivity.

Other work by Bloom.

DellaVigna et al., Uniform pricing in US retail chains allegedly finds a significant amount of money left on the table by retail chains (seven percent of profits) and explores why that might happen and what the impacts are.

The upside of work like this vs. sports work is that it attempts to quantify the impact of things outside of a contrived game. The downside is that the studies are on things that are quite messy and it's hard to tell what the study actually means. Just for example, if you look at studies on innovation, economists often use patents as a proxy for innovation and then come to some conclusion based on some variable vs. number of patents. But if you're familiar with engineering patents, you'll know that number of patents is an incredibly poor proxy for innovation. In the hardware world, IBM is known for cranking out a very large number of useless patents (both in the sense of useless for innovation and also in the narrow sense of being useless as a counter-attack in patent lawsuits) and there are some companies that get much more mileage out of filing many fewer patents.

AFAICT, our options here are to know a lot about decisions in a context that's arguably completely irrelevant, or to have ambiguous information and probably know very little about a context that seems relevant to the real world. I'd love to hear about more studies in either camp (or even better, studies that don't have either problem).

Thanks to Leah Hanson, David Turner, Milosz Dan, Andrew Nichols, Justin Blank, @hoverbikes, Kate Murphy, Ben Kuhn, Patrick Collison, and an anonymous commenter for comments/corrections/discussion.
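The essay's claim that a 162-game season filters out much more randomness than a short playoff series can be checked with a small binomial calculation. The sketch below is only an illustration: the 0.55 per-game win probability is a made-up number, games are assumed to be independent, and `majorityWinProbability` is a hypothetical helper name, none of which come from the post.

```javascript
// Probability that a team with per-game win probability p (0 < p < 1) wins
// more than half of n games, assuming independent games.
function majorityWinProbability(p, n) {
  const needed = Math.floor(n / 2) + 1;
  let total = 0;
  // Build the binomial pmf iteratively, starting at P(0 wins) = (1 - p)^n.
  let pmf = Math.pow(1 - p, n);
  for (let k = 0; k <= n; k++) {
    if (k >= needed) total += pmf;
    // P(k + 1 wins) = P(k wins) * (n - k) / (k + 1) * p / (1 - p)
    pmf *= ((n - k) / (k + 1)) * (p / (1 - p));
  }
  return total;
}

// Illustrative assumption: the better team wins 55% of individual games.
const bestOfSeven = majorityWinProbability(0.55, 7);
const fullSeason = majorityWinProbability(0.55, 162);
console.log("best-of-7:", bestOfSeven.toFixed(3));
console.log("162 games:", fullSeason.toFixed(3));
```

Winning a best-of-7 series is equivalent to winning a majority of seven games if all seven were played, which is why the same function covers both cases. Under these assumptions the better team comes out ahead noticeably more often over 162 games than over seven.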
In my journey to work more quickly with a project containing loads of dependencies, I’ve come across a few techniques I’ve not needed to use before. I previously wrote about How to Push to a Git Remote Branch of a Different Name — this time we’ll talk about installing a module from another repository instead […]The post How to Install a NPM Module from GitHub Branch appeared first on David Walsh Blog.
How to Push to a Git Remote Branch of a Different Name
Git is one of those tools that I’ve always known just enough about to be dangerous, and usually tend to learn new skills when I’m in a position to truly need them. Shockingly enough it has taken me roughly 15 years of using git for me to encounter the need to push to a remote […]The post How to Push to a Git Remote Branch of a Different Name appeared first on David Walsh Blog.
When I was young I remember looking at my bank book and seeing nice interest payments for cash I had in the bank. Fast forward to today and banks are giving essentially nothing for interest — your money just sits there collecting dust. In an ideal world you could put it into the stock market […]The post How to Earn Interest on Bitcoin appeared first on David Walsh Blog.
Every year I write a blog post about my goals for the year but I won’t pretend this year’s post is the same. I mean how the hell do I create realistic goals knowing what 2020 was and what 2021 inherits?! Pandemic, drastic political churn, social unrest…and none of that is related to my profession […]The post Goals For 2021 appeared first on David Walsh Blog.
Last week I tweeted all of you looking for your best JavaScript Array and Promise tricks, and as always, it didn’t disappoint — I learned quite a bit! Today’s JavaScript Promise trick is brought to you by Claudio Semeraro: how to use catch to set a default value instead of a try/catch: // Instead of […]The post Return a Default Value with Promises Using catch appeared first on David Walsh Blog.
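The excerpt above is cut off before its code, so the following is not the post's own snippet. It is a minimal sketch of the general technique the blurb names, using `catch` to supply a fallback value; `loadSettings`, `getSettings`, and `DEFAULT_SETTINGS` are hypothetical names invented for the illustration.

```javascript
// Hypothetical async loader that either resolves with data or rejects.
const loadSettings = (shouldFail) =>
  shouldFail
    ? Promise.reject(new Error("settings unavailable"))
    : Promise.resolve({ theme: "dark" });

const DEFAULT_SETTINGS = { theme: "light" };

// catch() converts a rejection into a fulfilled promise carrying the default,
// so callers can await the result without wrapping it in try/catch.
async function getSettings(shouldFail) {
  return loadSettings(shouldFail).catch(() => DEFAULT_SETTINGS);
}

getSettings(true).then((settings) => console.log(settings.theme));
getSettings(false).then((settings) => console.log(settings.theme));
```

One trade-off of this pattern is that the error is swallowed, so it suits cases where a default really is an acceptable outcome.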
I’ve written a number of blog posts about JavaScript tricks: Promise tricks, type conversion tricks, spread tricks, and a host of other JavaScript tricks. I recently ran into another JavaScript trick that blew my mind: how to break a forEach loop, shared by Andrea Giammarchi! To break the forEach loop at any point, you can […]The post Break a forEach Loop with JavaScript appeared first on David Walsh Blog.
Software engineering is full of jargon. Occasionally, to grasp the true meaning of the seemingly simplest of words, one must wade through many murky layers of complexity (fancy defining this, anyone?). Thankfully, other times, outwardly inaccessible words can be demystified pretty easily. In this article, we'll deal with the latter case, breaking down pure vs impure functions.

1. Pure Functions

To be considered pure, functions must fulfil the following criteria:

- they must be predictable
- they must have no side effects

Pure functions must be predictable.

Identical inputs will always return identical outputs, no matter how many times a pure function is called. In other words: we can run a pure function as many times as we like, and given the inputs remain constant, the function will always predictably produce the same output. Kind of like when you're a pizza-loving person with lactose intolerance. No, this time won't be different, so stop ogling that 16-incher your flatmate ordered.

Pure functions must have no side-effects.

A side-effect is any operation your function performs that is not related to computing the final output, including but not limited to:

- Modifying a global variable
- Modifying an argument
- Making HTTP requests
- DOM manipulation
- Reading/writing files

A pure function must both be predictable and without side-effects. If either of these criteria is not met, we're dealing with an impure function.

An impure function is kind of the opposite of a pure one - it doesn't predictably produce the same result given the same inputs when called multiple times, and may cause side-effects.
Let's have a look at some examples.

// PURE FUNCTION
const pureAdd = (num1, num2) => {
  return num1 + num2;
};

// always returns same result given same inputs
pureAdd(5, 5); // 10
pureAdd(5, 5); // 10

// IMPURE FUNCTION
let plsMutateMe = 0;
const impureAdd = (num) => {
  return (plsMutateMe += num);
};

// returns different result given same inputs
impureAdd(5); // 5
impureAdd(5); // 10
console.log(plsMutateMe); // 10 I'm now double digit, yay!

In the above example, the impure version of the function both changes a variable outside its scope, and results in different output, despite being called with identical input. This breaks both rules of pure functions and as such, it's pretty clear we're dealing with an impure function here. But let's have a look at an example of an impure function that is not so easy to tell apart from its pure counterpart.

// IMPURE FUNCTION
const impureAddToArray = (arr1, num) => {
  arr1.push(num);
  return arr1;
};

impureAddToArray([1, 2, 3], 4); // [1,2,3,4]
impureAddToArray([1, 2, 3], 4); // [1,2,3,4]

Given the same inputs, the function above will always return the same output. But it also has the side effect of modifying memory in-place by pushing a value into the original input array and is therefore still considered impure. Adding a value to an array via a pure function instead can be achieved using the spread operator, which makes a copy of the original array without mutating it.

// IMPURE FUNCTION
const impureAddToArray = (arr1, num) => {
  // altering arr1 in-place by pushing
  arr1.push(num);
  return arr1;
};

// PURE FUNCTION
const pureAddToArray = (arr1, num) => {
  return [...arr1, num];
};

Let's look at how we'd add to an object instead.

// IMPURE FUNCTION
const impureAddToObj = (obj, key, val) => {
  obj[key] = val;
  return obj;
};

Because we're modifying the object in-place, the above approach is considered impure.
Below is its pure counterpart, utilising the spread operator again.

// PURE FUNCTION
const pureAddToObj = (obj, key, val) => {
  return { ...obj, [key]: val };
};

Why should I care?

If the differences in the above examples seem negligible, it's because in many contexts, they are. But in a large-scale application, teams might choose pure over impure functions for the following reasons:

- Pure functions are easy to test, given how predictable they are.
- Pure functions and their consequences are easier to think about in the context of a large app, because they don't alter any state elsewhere in the program. Reasoning about impure functions and potential side-effects is a greater cognitive load.
- Pure functions can be memoized. This means that their output, given certain inputs, can be cached when the function first runs so that it doesn't have to run again - this can optimise performance.
- The team lead is a Slytherin obsessed with the purity status of both blood and functions (are we too old for HP references? I think not).

Pure functions are also the foundation of functional programming, which is a code-writing paradigm entire books have been written about. Moreover, some popular libraries require you to use pure functions by default, for example React and Redux.

Pure vs Impure JavaScript Methods

Certain JS functions from the standard library are inherently impure.

- Math.random()
- Date.now()
- arr.splice()
- arr.push()

Conversely, the below JS methods are considered pure (for the ones that take a callback, as long as the callback itself is pure).

- arr.map()
- arr.filter()
- arr.reduce()
- arr.every()
- arr.concat()
- arr.slice()
- Math.floor()
- str.toLowerCase()
- the spread syntax ... is also commonly used to create copies

1. Comparison

So who comes out as a winner in this battle between good and evil? Actually, nobody. They simply have different use cases, for example, neither AJAX calls, nor standard DOM manipulation can be performed via pure functions.
And impure functions aren't intrinsically bad, they just might potentially lead to some confusion in the form of spaghetti code in larger applications.

Sidenote: I resent the widely held sentiment that the word spaghetti should ever be associated with anything negative. Get in my tummy and out of coding lingo, beloved pasta.

I'll leave you with a quick tl;dr comparison table.

Pure Functions | Impure Functions
no side-effects | may have side-effects
returns same result if same args passed in no matter how many times it runs | may return different result if same args passed in on multiple runs
always returns something | may take effect without returning anything
is easily testable | might be harder to test due to side-effects
is super useful in certain contexts | is also super useful in certain contexts
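The article's point that pure functions can be memoized can be sketched as follows. This is a minimal illustration rather than a production cache: `memoize` and `slowSquare` are hypothetical names, and keying the cache with `JSON.stringify` assumes the arguments are JSON-serialisable.

```javascript
// Wraps a pure function so repeated calls with the same arguments reuse
// the cached result instead of recomputing it.
function memoize(fn) {
  const cache = new Map();
  return (...args) => {
    const key = JSON.stringify(args); // assumes JSON-serialisable arguments
    if (!cache.has(key)) {
      cache.set(key, fn(...args));
    }
    return cache.get(key);
  };
}

// The call counter exists only to show when the wrapped function really runs.
let calls = 0;
const slowSquare = (n) => {
  calls += 1;
  return n * n;
};

const memoSquare = memoize(slowSquare);
const first = memoSquare(4);
const second = memoSquare(4); // served from the cache
console.log(first, second, calls); // 16 16 1
```

Caching like this is only safe because the wrapped function's result depends on nothing but its arguments. Memoizing something like `Math.random()` or `Date.now()` would freeze the first answer forever.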
How React isn't reactive, and why you shouldn't care
If the title agrees with you, you can stop reading right now. Move on to the next article. In technology, we tend to grab on to differences to come up with easily identifiable discussion points even when the truth is less clear-cut.

So save yourself some time and move on if you don't want to put some mostly unnecessary information in your head. But if you are interested in this sort of thing let me give this a shot.

What is reactive programming?

This is the heart of it. If there was ever a more overloaded term... Reactive programming refers to a great number of things and most definitions are pretty poor. Either too specific to a mechanism or too academic. So I'm going to take yet another stab.

Reactive Programming is a declarative programming paradigm built on data-centric event emitters.

There are two parts to this. "Declarative programming paradigm" means that the code describes the behavior rather than how to achieve it. Common examples of this are HTML/templates where you describe what you will see rather than how it will be updated. Another is the SQL query language where you describe what data you want rather than how to fetch it.

SELECT name FROM customers
WHERE city = 'Dallas'
ORDER BY created_at DESC

This paradigm can apply to data transformation as well and is often associated with functional programming. For example, this map/filter operation describes what your output is rather than how you get there.

const upperCaseOddLengthWords = words
  .filter(word => word.length % 2)
  .map(word => word.toUpperCase());

The second part is "data-centric event emitter". We've all worked in systems with events. The DOM has events for when the user interacts with Elements. Operating systems work off event queues. They serve as a way to decouple the handling of changes in our system from the actors that trigger them. The key to a reactive system is that the actors are the data. Each piece of data is responsible for emitting its own events to notify its subscribers when its value has changed.
There are many different ways to implement this from streams and operators to signals and computations, but at the core, there is always this event emitter.

Common types of reactivity

There are 2 distinct common types of reactivity. They evolved to solve different problems. They share the same core properties but they are modeled slightly differently.

1. Functional Reactive Programming (FRP)

This is probably the one you hear about the most but isn't necessarily the most used. This one is based around async streams and processing those with operators. This is a system for transformation. It is ideal for modeling the propagation of change over time.

Its most famous incarnation in JavaScript is RxJS, which powers things like Angular.

const listener = merge(
  fromEvent(document, 'mousedown').pipe(mapTo(false)),
  fromEvent(document, 'mousemove').pipe(mapTo(true))
)
  .pipe(sample(fromEvent(document, 'mouseup')))
  .subscribe(isDragging => {
    console.log('Were you dragging?', isDragging);
  });

You can see this stream build in front of you. You can describe some incredibly complex behavior with minimal code.

2. Synchronous Reactive Programming (SRP)

Also known as fine-grained reactive programming. This is the one often associated with spreadsheets or digital circuits. It was developed to solve synchronization problems. It has little sense of time but ensures glitchless data propagation so that everything is in sync. It is built on signals and auto-tracking computations instead of streams and operators. Signals represent a single data point whose changes propagate through a web of derivations and ultimately result in side effects.

Often you use these systems without realizing it.
It is the core part of Vue, MobX, Alpine, Solid, Riot, Knockout.

import { observable, autorun } from "mobx"

const cityName = observable.box("Vienna")

autorun(() => {
  console.log(cityName.get())
})
// Prints: 'Vienna'

cityName.set("Amsterdam")
// Prints: 'Amsterdam'

If you look, cityName's value looks like it is actually being pulled instead of pushed. And it is on initial execution. These systems use a hybrid push/pull system, but not for the reason you might think. It is to stay in sync.

Regardless of how we attack it, computations need to run in some order, so it is possible to read from a derived value before it has been updated. Given the highly dynamic nature of the expressions in computations, topological sort is not always possible when chasing optimal execution. So sometimes we pull instead of push to ensure consistency when we hit a signal read.

Also worth mentioning: Some people confuse the easy proxy setter as being a sure sign something is reactive. This is a mistake. You might see city.name = "Firenze" but what is really happening is city.setName("Firenze"). React could have made their class component state objects proxies and it would have had no impact on behavior.

Which brings us to...

Is React not reactive?

Well, let's see about that. React components are driven off state, and setState calls are sort of like data events. And React's Hooks and JSX are basically declarative. So what's the issue here?

Well, actually very little. There is only one key difference: React decouples the data events from component updates. In the middle, it has a scheduler. You may setState a dozen times but React takes notice of which components have been scheduled to update and doesn't bother doing so until it is ready.

But all of this is a type of buffering. Not only is the queue filled by the state update event, but the scheduling of processing that queue is as well. React isn't sitting there with some ever-present polling mechanism to poll for changes.
The same events drive the whole system.

So is React not reactive? Only if you view reactivity as a push-only mechanism. Sure, React's scheduling generally doesn't play as nice with push-based reactive systems as some would want, but that is hardly evidence. It seems to pass the general criteria. But it is definitely not typical FRP or SRP. Know what else isn't? Svelte.

Strawman Argument

When you update a value in Svelte in an event handler and happen to read a derived value on the next line of code, it isn't updated. It is definitely not synchronous.

<script>
  let count = 1;
  $: doubleCount = count * 2;
</script>

<button on:click={() => {
  count = count + 1;
  console.log(count, doubleCount); // 2, 2
}}>Click Me</button>

In fact, updates are batched and scheduled similarly to React. Maybe not interruptible like time-slicing, but this doesn't fit cleanly into FRP or SRP. In fact, most frameworks do this sort of batching. Vue does as well when talking about DOM updates. Setting count twice synchronously and sequentially doesn't result in Svelte updating the component more than once.

Taking it a step further, have you seen the compiled output of this? The important parts look like this:

let doubleCount;
let count = 1;

const click_handler = () => {
  $$invalidate(0, count = count + 1);
  console.log(count, doubleCount); // 2, 2
};

$$self.$$.update = () => {
  if ($$self.$$.dirty & /*count*/ 1) {
    $: $$invalidate(1, doubleCount = count * 2);
  }
};

Unsurprisingly, $$invalidate is a lot like setState. Guess what it does? It tells the component to call its update function. Basically exactly what React does. There are differences in execution after this point due to differences in memoization patterns and VDOM vs no VDOM. But for all purposes, Svelte has a setState function that re-evaluates its components. And like React it is component granular, performing a simple flag-based diff instead of one based on a referential value check.

So is Svelte not reactive?
It has all the characteristics we were willing to disqualify React for.

Summary

This whole line of argument is mostly pointless. Just like the argument of JSX versus custom template DSLs. The difference in the execution model can be notable. But Svelte's difference isn't due to reactivity but because its compiler separates create/update paths, allowing it to skip the VDOM.

The React team acknowledges that it isn't fully reactive. While that seems like it should be worth something, in practice it isn't that different from many libraries that claim to be reactive. Sure, React Fiber takes scheduling to the extreme, but most UI frameworks automatically do some amount of this.

Reactivity isn't a specific solution to a problem, but a way to model data change propagation. It's a programming paradigm. You can model almost any problem with reactive approaches. And the sooner we treat it as such the sooner we can focus on the problems that matter.
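To make the two ideas in this article concrete, here is a rough sketch of a "data-centric event emitter" feeding a scheduler that buffers updates. It is not the implementation of React, Svelte, or MobX; `createSignal` and `createScheduler` are hypothetical helpers written only for this illustration, and the flush is triggered by hand rather than by a real scheduling mechanism.

```javascript
// A piece of data that emits its own change events to subscribers.
function createSignal(initialValue) {
  let value = initialValue;
  const subscribers = new Set();
  return {
    get: () => value,
    set(next) {
      if (Object.is(next, value)) return; // no event if nothing changed
      value = next;
      for (const notify of [...subscribers]) notify(value);
    },
    subscribe(notify) {
      subscribers.add(notify);
      return () => subscribers.delete(notify); // unsubscribe function
    },
  };
}

// A scheduler that only records that an update is needed and runs it later,
// so many change events collapse into a single update.
function createScheduler(update) {
  let dirty = false;
  return {
    invalidate() {
      dirty = true;
    },
    flush() {
      if (!dirty) return false;
      dirty = false;
      update();
      return true;
    },
  };
}

const count = createSignal(1);
const renders = [];
const scheduler = createScheduler(() => renders.push(count.get() * 2));
count.subscribe(() => scheduler.invalidate());

count.set(2);
count.set(3);
scheduler.flush(); // both changes are handled by one update
console.log(renders); // [6]
```

The change events still originate from the data. The scheduler just sits between the event and the update, which is the decoupling the article describes for both React and Svelte.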
How to create simple multi-step sign in with validation
Introduction

Let's say you need to create a multi-step login form like the one in Gmail. You are using React and a global store (redux, mobx) for development, and you want to isolate components from each other in order to reuse them in the future. Besides this, you need to add validation to each step. In this article I will show the simplest and, in my opinion, most correct solution. You can check the complete solution here.

Dependencies

First of all, we need a library for processing the form. In my opinion the best solution is react-hook-form (https://react-hook-form.com/). The site describes in great detail why this is an excellent solution; I will add on my own that this library has powerful functionality (validations, quick integrations, controller mechanism) and good documentation.

For validation we will use the yup library, which is a very powerful and popular library.

For the global store I will use little-state-machine, because it's a very simple solution built on a flux architecture. But you can use redux or mobx.

To integrate yup validation schemas with react-hook-form you will also need the @hookform/resolvers package.

Let's code

Project Structure

The example uses the following project structure:

steps <- here will be all form steps
  Congrats.js <- final step, if sign in is successful
  Email.js <- first step, enter email to continue sign in
  Password.js <- second step, enter password to sign in
store
  actions.js <- includes all actions, in my case only one for updating the form state
  index.js <- includes app state, in my case only the form state
App.js <- main component, in my case includes the form logic
index
App.css <- app styles

About store

In the store we will keep information about the current step of the form and the email data.
Let's add this information in store/index.js:
const state = { step: "Email", email: "" };
export default state;
Now let's add an action to update the form in actions.js:
const updateFormState = (state, payload) => {
  return { ...state, ...payload };
};
export default updateFormState;
Let's add our store to the application in index.js:
import { StrictMode } from "react";
import ReactDOM from "react-dom";
import App from "./App";
import { StateMachineProvider, createStore } from "little-state-machine";
import store from "./store";

// create our global form state
createStore(store);

const rootElement = document.getElementById("root");
ReactDOM.render(
  <StrictMode>
    <StateMachineProvider>
      <App />
    </StateMachineProvider>
  </StrictMode>,
  rootElement
);
Base logic
The logic for switching the form, as well as its handlers, will be in App.js (for the example only). We need to connect the store to the component in order to receive information about the form and update it.
import "./styles.css";
import { useStateMachine } from "little-state-machine";
import updateFormState from "./store/actions";
// Here we import form steps
import EmailStep from "./steps/Email";
import CongratsStep from "./steps/Congrats";
import PasswordStep from "./steps/Password";

export default function App() {
  // use hook for getting form state and actions
  const { state, actions } = useStateMachine({ updateFormState });
  // form handler for email step
  const emailFormHandle = ({ email }) => {
    actions.updateFormState({ email: email, step: "Password" });
  };
  // form handler for password step
  const passwordFormHandle = ({ password }) => {
    actions.updateFormState({ step: "Congrats" });
  };
  // sign out handler
  const signOutHandle = () => {
    actions.updateFormState({ step: "Email" });
  };
  return (
    <div>
      {state.step === "Email" && (
        <EmailStep email={state.email} onSubmit={emailFormHandle} />
      )}
      {state.step === "Password" && (
        <PasswordStep onSubmit={passwordFormHandle} />
      )}
      {state.step === "Congrats" && (
        <CongratsStep email={state.email} onSignOut={signOutHandle} />
      )}
    </div>
  );
}
Form step components are isolated from each other as much as possible and can be reused in other parts of the application. All you need to pass in is the default values, if they exist (for the email step), and a form handler function.
Steps
Email
The email entry step is the first step of user authorization. We need to check the validity of the entered email and remember it, in case the user wants to go back from the password step and change it. This may seem far-fetched, but when a form has a lot of inputs, saving their state is very useful and saves the user's time. Code with comments over here:
import { useForm } from "react-hook-form";
// import our validation library
import * as yup from "yup";
// import integration library
import { yupResolver } from "@hookform/resolvers/yup";
import cn from "classnames";

// validation schema
const Schema = yup.object().shape({
  // the input named "email" must be a string that looks like an email;
  // change the error message by changing the text passed to email()
  email: yup.string().email("Enter valid email please")
});

const EmailStep = (props) => {
  // get the form submit handler and stored email from the parent component
  const { onSubmit, email } = props;
  // apply the validation schema to the react-hook-form form object
  const { errors, register, handleSubmit } = useForm({
    resolver: yupResolver(Schema),
    // if the user entered an email before, use it as the default value
    defaultValues: { email }
  });
  // you can check all validation errors in the console
  console.log(errors);
  return (
    <form onSubmit={handleSubmit(onSubmit)}>
      <div className="form-group">
        <h2>Enter your email</h2>
      </div>
      <div className="form-group">
        {/* check validation errors */}
        {errors.email && (
          <h4 className="invalid-msg">{errors.email.message}</h4>
        )}
        <input
          // mark the input invalid if there are email validation errors
          className={cn(errors.email && "input-invalid")}
          name="email"
          ref={register}
          placeholder="Your email"
        />
      </div>
      <div className="form-group">
        <button type="submit">Next</button>
      </div>
    </form>
  );
};
export default EmailStep;
What you need to know:
Form validation is applied after the user clicks the submit button (the Next button in my case), but you can change this behavior in the form options.
All validation errors are in the errors object, which is generated by react-hook-form. The key is the input name (email) and the message holds the validation text (Enter valid email please).
You can use the default validation rules of the react-hook-form form object without any extra libraries, but yup is a more powerful and flexible package.
Password step
The last step of user authorization. The password should be at least 6 characters long and include Latin letters. The code is below:
import { useForm } from "react-hook-form";
import * as yup from "yup";
import { yupResolver } from "@hookform/resolvers/yup";
import cn from "classnames";

const Schema = yup.object().shape({
  password: yup
    .string()
    .min(6, "Password is too short")
    // the pattern only checks that at least one Latin letter is present
    .matches(/[a-zA-Z]/, "Password must contain Latin letters.")
});

const PasswordStep = (props) => {
  const { onSubmit } = props;
  const { errors, register, handleSubmit } = useForm({
    resolver: yupResolver(Schema)
  });
  console.log(errors);
  return (
    <form onSubmit={handleSubmit(onSubmit)}>
      <div className="form-group">
        <h2>Enter your password</h2>
      </div>
      <div className="form-group">
        {errors.password && (
          <h4 className="invalid-msg">{errors.password.message}</h4>
        )}
        <input
          className={cn(errors.password && "input-invalid")}
          name="password"
          type="password"
          ref={register}
          placeholder="Your password"
        />
      </div>
      <div className="form-group">
        <button type="submit">Sign In</button>
      </div>
    </form>
  );
};
export default PasswordStep;
Final step
And finally, let's show the user a congrats message:
const CongratsStep = (props) => {
  const { email, onSignOut } = props;
  return (
    <div className="form-group">
      <h2>
        Hello, {email}
        <button onClick={onSignOut}>Sign Out</button>
      </h2>
      <img src="https://i.giphy.com/6nuiJjOOQBBn2.gif" alt="" />
    </div>
  );
};
export default CongratsStep;
Conclusion
That's all. We created isolated form steps, added a default value for the email, and added validation rules to every form step, using powerful and popular packages for this (excluding little-state-machine).
If you are interested, I can show these examples with TypeScript, MUI and MobX or Redux.
P.S.
This is my first article, and English is not my native language. I hope everything was clear and you had a pleasant time :) If you have trouble understanding the text, you can always look at my code; it says much more than any words.
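The step-switching logic from App.js can also be tried out without React. The sketch below reuses updateFormState from store/actions.js and replays the three handlers against a plain object. The `dispatch` helper and the local `state` variable are assumptions made for this example; they only stand in for what little-state-machine does and are not its real API.

```javascript
// Same reducer as store/actions.js: merge the payload into the state.
const updateFormState = (state, payload) => ({ ...state, ...payload });

// Initial state, as in store/index.js.
let state = { step: "Email", email: "" };

// Hypothetical stand-in for actions.updateFormState from little-state-machine.
const dispatch = (payload) => {
  state = updateFormState(state, payload);
  return state;
};

// The three handlers from App.js, rewritten against dispatch.
const emailFormHandle = ({ email }) => dispatch({ email, step: "Password" });
const passwordFormHandle = () => dispatch({ step: "Congrats" });
const signOutHandle = () => dispatch({ step: "Email" });

emailFormHandle({ email: "user@example.com" });
console.log(state.step); // "Password"
passwordFormHandle();
console.log(state.step); // "Congrats"
signOutHandle();
console.log(state.step); // "Email"
```

Note that signing out only resets the step, so the stored email is still there and becomes the default value of the email input again.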
New to version control? Welcome! Understanding the lingo is very important. This can be overwhelming, but don't worry, you'll get there!
In this short post I will explain what a branch and a tag are, what they are used for, and the differences between them.
...
As defined in gitglossary:
branch
A branch is an active line of development. The most recent commit on a branch is referred to as the tip of that branch. The tip of the branch is referenced by a branch head, which moves forward as additional development is done on the branch. A single git repository can track an arbitrary number of branches, but your working tree is associated with just one of them (the current or checked out branch), and HEAD points to that branch.
tag
A ref pointing to a tag or commit object. In contrast to a head, a tag is not changed by a commit [...]. A tag is most typically used to mark a particular point in the commit ancestry chain.
Branches:
Let's explain how this works in real life.
You write code on a branch. You may have a repository with only one branch (master), and then the code you commit is added to that branch. A more common workflow is to create (checkout) a new branch when working on a feature or a bug, stemming from a master or develop branch. When your work is completed, saved (committed) and pushed to the remote, hopefully your code will be reviewed and merged into the main development branch.
When you checkout a branch, it points to the most recent commit that you have locally. Branches are dynamic and code can be added to them.
Tags:
A tag points to a specific commit on any branch. You cannot add more code to a tag; it is a reference to a specific commit, kind of like a snapshot.
When would you want something like this? It is useful to create tags when releasing versions. When checking out a tag you can always be sure you'll be getting the same code each time.
In conclusion:
A branch is an active line of development, whereas a tag is an immutable reference to a specific commit on a branch.
...
Hope that clears up some confusion for you. Happy developing!
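As a toy illustration of the difference (plain JavaScript, and in no way how git is implemented): a branch is a named pointer that moves with every commit, while a tag keeps pointing at the commit it was created on. `ToyRepo` and its methods are invented for this sketch.

```javascript
// Toy model only: real git stores commits as hashed objects, not "c1", "c2".
class ToyRepo {
  constructor() {
    this.commits = [];                 // commits in creation order
    this.branches = { master: null };  // branch name -> tip commit id
    this.tags = {};                    // tag name -> commit id
    this.current = "master";           // the checked-out branch
  }

  commit(message) {
    const id = `c${this.commits.length + 1}`;
    this.commits.push({ id, message });
    this.branches[this.current] = id;  // the branch tip moves forward
    return id;
  }

  tag(name) {
    // A tag records the current tip once and is not touched by later commits.
    this.tags[name] = this.branches[this.current];
  }
}

const repo = new ToyRepo();
repo.commit("first version");
repo.tag("v1.0");
repo.commit("more work");

console.log(repo.branches.master); // "c2"
console.log(repo.tags["v1.0"]);    // "c1"
```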
One thing that happens when you're developing projects, no matter how big or small, is burnout. You work on a project for hours, days, weeks, months, or even years and you hit those points (or point, if you've given up) of feeling like it's not worth it, or that the project is too big or tedious for someone like you. Today we're going to talk about that feeling, how to escape it, how to stay motivated through projects, and what we can learn from how people on other projects push through it.
Motivation, or "willpower" as some people call it, is the ability to push through a project, doing whatever is necessary to see it succeed. Sometimes, though, we get to a point where it just doesn't feel worth it: you burn yourself out, the project seems too big or tedious, you get stressed out, and so on. So how do you push through it? Well, that's easier said than done. A while ago I asked Andrew Kelley (the creator of the programming language zig) how he doesn't get burnt out. Let's take a look at his response and analyze it a bit.
One of the big tricks is to allow myself to bounce around to different aspects of the project at the whim of my motivation.
Another trick is that I get motivated by trying to "unblock" other people from contributing. A lot of what I do is to work on areas that unlock aspects of the project so that other people can contribute, and I get energized by seeing people use zig, whether happy, or struggling, either way it makes me want to do more.
This was really inspiring to me, and in my opinion really good advice: work on different parts of the project, and use the people using your project, struggling or not, as a form of motivation. I can certainly see how doing small things like this helps.
Sometimes you just need that kind of convincing or "push" to keep you moving forward, and seeing people use or acknowledge your project is definitely rewarding, especially at a large scale.
I also asked someone who works at the company Discord how they don't get burnt out working on such a large-scale project. It is much larger than something like zig, with millions of people using it daily. Let's take a look at what they had to say about the question.
Ok so, to your question, how do i not burn out? The answer is, i do deal with burn out, probably more now than ever, since im a tech lead rather than just a team member.
But the truth is, i love what i do and i love working on discord. it matters a lot to people, so that does motivate me to ensure we ship good features, i know that even if i work my ass off and stress myself out over something, millions of people's lives will be impacted (hopefully for the better).
We start to see a common theme here: people are motivated by other people. They see people using their projects, they see people happy or struggling, and they use that as motivation to continue working on the project. But there's also another key factor: they love working on it. If you do not enjoy working on a project, no matter how big or small, then it's not going to be easy, or maybe even worth your time.
So what bullet points can we take from this, and how can we stay motivated during burnout and while working on projects?
Work on something you enjoy working on
Use the fact that people use your project (either struggling, or happy with it) as a "push"
Vary which parts of a project you're working on, so it doesn't become repetitive
But there's one more bullet point that works for me and is not mentioned in these responses: work on something you would use. Don't make something purely because someone else might use it. If you make it for other people and no one uses it, you're going to be let down, depressed, and/or upset with yourself. If you make something you would personally use, at least you will be using it. This is a very important thing to keep in mind. Discord, for example, was originally made because the founders were gamers and wanted a place to chat made for gamers, and it later branched out into something even greater. Zig was made because the creator saw flaws in other languages and wanted to make something he and other people could use that's robust and reliable while keeping simplicity.
So in the end? The things to take away from this are: work on something you enjoy, use your project's community and/or userbase as a form of inspiration and motivation, vary your workload and the parts of the project you work on to prevent things from getting boring and repetitive, and make or work on something you as an individual would personally use, with other people as a sort of 'side' motivation.
This article was originally posted on my blog.
Sometimes an array of data needs to be submitted in an HTML form. A form to send invites to a bunch of users, for example. That'd just be a series of email fields, and in the controller we'd want an array of the emails to iterate over and send invites.
Rails form helpers are set up to have a key and value for each field, so it's not immediately obvious how to send an array of data. But in Rails, there's always a way! If we append [] to the name of a field, it'll be parsed as an array! So for an array of emails to send invites to, setting the name to invites[] on all the email fields gives us an array in the controller.
To generate the form, we can use the following code:
<%= form_with(scope: :invites, url: invite_path) do |form| %>
  <% 3.times do %>
    <!-- id has to be nil to prevent clashes as all fields would have the same id -->
    <%= form.email_field nil, id: nil, placeholder: "Email Address" %>
  <% end %>
  <%= form.submit %>
<% end %>
Passing nil as the first argument to email_field instead of a name gives us the [] we need for an array. The generated markup looks like:
<form action="/invite" accept-charset="UTF-8" data-remote="true" method="post">
  <input type="hidden" name="authenticity_token" value="....">
  <input placeholder="Email Address" type="email" name="invites[]">
  <input placeholder="Email Address" type="email" name="invites[]">
  <input placeholder="Email Address" type="email" name="invites[]">
  <input type="submit" name="commit" value="Save Invites" data-disable-with="Save Invites">
</form>
In the controller, we can use the submitted data as:
def create
  invited_users = params.require(:invites)
  p invited_users
  # => ["a@example.com", "b@example.com", "c@example.com"]
end
To permit an array parameter in a nested form field, we can use the following pattern:
params.require(:product).permit(tags: [])
It's a little unintuitive, but submitting arrays from a form can be a useful tool in your toolbox!
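To see what the browser sends for such a form, here is a small JavaScript sketch. The rest of this post is Rails; this only illustrates the request body, and the addresses are sample values. Each invites[] field is sent as its own key/value pair with the same key, and it is the server-side parameter parser that collects them into one array.

```javascript
// Build the body the way a form with three invites[] fields would.
const body = new URLSearchParams();
body.append("invites[]", "a@example.com");
body.append("invites[]", "b@example.com");
body.append("invites[]", "c@example.com");

// The same key is repeated once per field (brackets are percent-encoded).
const encoded = body.toString();
console.log(encoded);

// Collecting every value that shares the key gives the array.
const invites = new URLSearchParams(encoded).getAll("invites[]");
console.log(invites);
```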
Recently I started a site of tools for board game players. Simple tools like dice and spinners.
I wanted the site to be as accessible as possible. So I challenged myself with some rules on how it would work.
One rule was that every tool must work without JavaScript.
I learned a lot by doing it, and started to write posts about building tools without js. But before I write any more I wanted to answer the question:
Why in 2021 would you bother making a website without js?
While researching this post I found two really great sources of information. So most of this is going to be stolen from this article by Adam Silver and this post from gov.uk.
But I'm going to go a little deeper into why some people block js.
The obvious answer to why you should build a website that doesn't need js is that some people don't use js. But how many?
How many visitors don't use JavaScript?
The answer to this question is roughly 1%.
There's not a lot of information on this, but here's what I found:
A 2010 study by Yahoo suggests 1.3% (web.archive)
A 2013 study by gov.uk suggests 1.1% (gov.uk)
For BuzzFeed in 2018 it was 1% (youtube)
1% sounds like a lot! Is it really possible that 1 in 100 people block JavaScript? Well... no.
The 1% from these studies is 1% of visits where JavaScript has failed for any reason. According to gov.uk, the number of people who actively block js (or use a really, really old browser) is 0.2%, or 1 in 500.
Those 0.2% have their reasons, but first let's look at the 0.8% of visits where the js fails.
Why does JavaScript fail?
There are lots of reasons your site's JavaScript might fail:
Your JavaScript is broken! It happens.
A feature you're using doesn't work on an older browser, e.g. ES6 on an old version of Internet Explorer.
Interference from a browser extension. Some web extensions alter your site's code, with negative effects.
Network errors. Sometimes things just break.
Mobile users losing signal, e.g. from being in a rural area, going through a tunnel, falling down a manhole, etc.
Some browsers block JavaScript on slow connections. Android does this.
A CDN going down. In 2017 AWS went down for 3 hours.
Corporate or local blocking or stripping of JavaScript. Sometimes organizations block JavaScript for security reasons.
ISPs accidentally blocking your CDN. Sky Broadband once blocked jQuery.
Mobile networks altering your content and breaking it. T-Mobile and Orange also broke jQuery!
There are probably other reasons too.
That accounts for about 0.8% of visits where JavaScript isn't used.
But what about the 0.2% who block js?
Why do people block JavaScript?
Some people block JavaScript in their browser. Some people choose a browser that doesn't support JavaScript. There are a number of reasons why:
Accessibility
Security
Privacy
Cost
Bandwidth
CPU
Battery
They are stuck with, or prefer, a very old or text-based browser
They just like the web without JavaScript
Accessibility
Some people find it easier to navigate the web with JavaScript switched off. There are fewer distractions.
Others choose text-to-speech browsers that don't support js.
Text-to-speech can work fine with JavaScript. For instance, VoiceOver on macOS works within any browser.
Security
Many people disable js for security reasons, both professional and personal. No JavaScript means no malicious JavaScript either.
Who does this?
People who work with sensitive or valuable data.
Journalists and whistleblowers. Edward Snowden recommends switching off js.
Cautious people who don't want to get their credit cards stolen.
Privacy
Lots of people don't like corporations collecting their personal data. You might block ads and tracking scripts.
Some people take that a step further and block all JavaScript. Then, if they trust a site, they'll allow it to run.
Cost & Bandwidth
Blocking JavaScript can save a lot of money. Downloading d3.js (a popular graphing library) costs 1 cent in Canada. In Mauritania it costs 0.06% of the average daily income.
That may not seem like a lot. But d3.js is only 90kB and is only one of many scripts someone may have to download to use a site.
Then JavaScript can request all kinds of data, images and video, and it adds up fast. Once you've visited a few sites you may find yourself over budget.
The same logic applies to people with limited bandwidth. dev.to costs 24 cents to visit on mobile in Canada!
CPU and Battery
Some people switch off JavaScript to save CPU and battery. Users of low-powered devices, or of a device that's doing more important tasks in the background, may want to take pressure off their CPU. People without easy access to a power supply may want to save battery.
Out-dated Browsers & Text-based browsers
Very old browsers like IE < 3, Netscape 1, Mosaic and others don't support JavaScript. Almost nobody uses these browsers anymore, but you can bet somebody is.
Some text-based browsers like Lynx don't support js. Lynx is a browser that runs in terminal applications. So someone browsing the web on a computer without a GUI may well be using it.
Lynx has been around since 1992 and is still updated today. So people are definitely using it.
Some people just prefer the web without js
Some people think the web is better browsed with JavaScript off. It's faster and reduces distractions. See "I Turned Off JavaScript for a Whole Week and It Was Glorious" (Wired, 2015).
Should you cater to 0.2%?
Yes and no. Personally, I enjoy going out of my way to make things work. I find all this stuff fascinating. But making sure a site works for the 0.2% of people who disable JavaScript isn't really the point.
The Curb Cut Effect
An analogy that comes up often when talking about web accessibility is curb cuts. Curb cuts are the small concrete ramps on the side of the road.
Curb cuts were added to sidewalks after a long campaign by disability rights activists. Their purpose was to give wheelchair users the same freedoms non-disabled people enjoy.
Now that curb cuts are everywhere, everyone benefits from them.
People with strollers, skateboarders, people delivering packages, and more.
The point? Making the world more accessible for one group of people benefits everyone. That's the curb cut effect.
Here's a great episode of 99% Invisible about curb cuts.
Building sites that function well without JavaScript doesn't just benefit the 0.2% of people who disable it.
It improves the 0.8% of visits where JavaScript fails too.
Building everything you can without js will make your site:
faster
smaller (most of the time)
more reliable
more accessible
have smoother animations
easier to index by search engines
less vulnerable to hacks
easier to develop (personal opinion)
I'd prefer to write js all day, but finding html and css only solutions has made me a better developer.
It's forced me to find creative ways of solving problems and to learn new html and css features.
Then when I do add JavaScript, it ends up being a lot smaller and easier to manage.
How to disable JavaScript
If you're going to build sites without js you're going to need to test them. Probably the most popular way of disabling JavaScript is the browser extension NoScript. It's available on Google Chrome, Firefox and elsewhere too.
As of writing this, NoScript has:
100,000+ users on Google Chrome
404,376 users on Firefox
That's at least half a million people who use that specific extension. There are many more apps and other methods of disabling js.
Anyway
Give browsing the web without js a try, or maybe even have a go at using Lynx. Let me know what you think.
Svelte Quick Tip: Connect a store to local storage
Local storage, oh my
Here's a really quick tip for you today: how to use Svelte stores to keep data in sync with local storage.
This is particularly useful if you want to persist some user values, say UI configuration (e.g. their preferred theme, something that is shown/hidden, etc.), and have the settings retained for future sessions.
Doing this with Svelte is pretty trivial, let's check it out.
Create the store
All we need to do to connect to local storage is create a writable store, set its default value from local storage, and on any change (via subscribe) update the local storage entry.
// src/stores/content.js
import { writable } from 'svelte/store'

// Get the value out of storage on load.
const stored = localStorage.content
// or localStorage.getItem('content')

// Set the stored value or a sane default.
export const content = writable(stored || 'Hello, World!')

// Anytime the store changes, update the local storage value.
content.subscribe((value) => localStorage.content = value)
// or localStorage.setItem('content', value)
The key thing to remember here is that local storage always stores strings, so if you're storing something else, say a boolean or some JSON, you will need to convert between the data type you want and its local storage string representation.
For example, if you wanted to store a boolean, it would look more like this:
// src/stores/enabled.ts
import { writable } from 'svelte/store'

const stored = localStorage.enabled
export const enabled = writable<boolean>(stored ? stored === 'true' : true)
enabled.subscribe((value) => localStorage.enabled = String(value))
Notice that we read the value and compare it to the string 'true' instead of treating it like a boolean, which wouldn't work. Also note that we need to convert it to a string before saving it to local storage (especially if we're using TypeScript).
Use your store
Now you can use the value in your component:
<script>
  // adjust the path to wherever your store file lives
  import { content } from "./stores/content"
</script>
<p>{$content}</p>
<input bind:value={$content} />
Any time you update the value it will be updated in local storage, and when you reload it will automatically be set to the value you had last. Pretty neat!
That's it!
I told you it would be quick. Hopefully this comes in handy for you, cheers!
EDIT: Thanks to @lukeed for pointing out you can do localStorage['content'] (or localStorage.content) instead of the more verbose localStorage.getItem('content'), and localStorage.content = '...' instead of localStorage.setItem('content', '...').
EDIT 2: Shoutout to Jamie Birch on Twitter, who mentioned it might be safer to stick with getItem and setItem since they're specifically declared in the local storage spec. It seems safe enough to use the property accessors, but if you want to be extra safe, use getItem and setItem.
Thanks for reading! Consider giving this post a reaction or bookmarking it for later. Have other tips, ideas, feedback or corrections? Let me know in the comments! Don't forget to follow me on Dev.to (danawoodman), Twitter (@danawoodman) and/or Github (danawoodman)!
Photo by Joshua Aragon on Unsplash
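The same to/from-string conversion applies to JSON values. Here is a minimal sketch, assuming two hypothetical helpers (readJSON and writeJSON are made up for this example and are not part of Svelte) that work with any object exposing getItem and setItem, such as localStorage. A Map-backed fake storage is used so the snippet also runs outside the browser.

```javascript
// Hypothetical helpers; not part of Svelte or the Web Storage API.
function readJSON(storage, key, fallback) {
  const raw = storage.getItem(key);
  if (raw === null || raw === undefined) return fallback; // nothing stored yet
  try {
    return JSON.parse(raw);
  } catch {
    return fallback; // the stored string was not valid JSON
  }
}

function writeJSON(storage, key, value) {
  storage.setItem(key, JSON.stringify(value));
}

// Fake storage with the same getItem/setItem shape as localStorage.
const fakeStorage = {
  data: new Map(),
  getItem(key) { return this.data.has(key) ? this.data.get(key) : null; },
  setItem(key, value) { this.data.set(key, String(value)); },
};

const defaults = { theme: "light", sidebar: true };
console.log(readJSON(fakeStorage, "settings", defaults).theme); // "light"
writeJSON(fakeStorage, "settings", { theme: "dark", sidebar: false });
console.log(readJSON(fakeStorage, "settings", defaults).theme); // "dark"
```

In a store file this would probably look like writable(readJSON(localStorage, 'settings', defaults)) together with a subscribe callback that calls writeJSON, mirroring the string example.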
In this article I will give you an introduction to Kubernetes. You can watch the full video on YouTube:
So what we will cover today:
Life before containerisation
Container deployment
What is Container Orchestration
What is Kubernetes
Why use Kubernetes
Some benefits of Kubernetes
Architecture of Kubernetes
Please like, share and subscribe if you like the video. It will really help the channel.
Life before Containerisation:
There were 2 main ways to deploy applications before containers:
Single Server deployment
All applications and services were hosted on a single server, which means the server did not have a single responsibility. The server was jam-packed with everything. When multiple applications run on one server, one application can take up most of the resources and, as a result, the other applications underperform.
Advantages
Easy to host
Easy to maintain
Disadvantages
Very hard to manage
Very hard to scale
Issues with resource allocation for all of the different services
VM deployment
As a solution to Single Server Deployment, virtualisation was introduced. It allows you to run multiple Virtual Machines (VMs) on a single physical server's CPU. Every application has its own Virtual Machine, which means complete separation of concerns. This approach remains a valid option in today's world. We can implement scalability and resource allocation in a much better way than with Single Server Deployment.
Advantages
Less downtime: if one service is down, the other ones still function
Better load handling
Better resource allocation
Disadvantages
Difficult to manage the environment
Security risks
Requires a lot of resources
Consistency of the VM images, which might cause errors with the application
Container deployment
Containers are similar to VMs, but they have relaxed isolation properties.
Unlike VMs, containers share the OS and use its kernel to run.
Advantages
CI/CD: continuous integration/continuous deployment pipelines
Environment consistency
Resource isolation
Resource utilisation
Loosely coupled applications
Disadvantages
Learning curve to implement
Complex structure for a simple application
Migration from VM to container deployment is time consuming
What is Container Orchestration
With Docker we can run a single instance of an image with the docker run command. What happens when the number of users increases and a single instance is not enough? One way of handling it is running docker run multiple times.
We need to keep an eye on which containers are running and what state they are in, and in case of a failure we need to issue a new docker run so we have a new instance of the application. Another aspect is the health of the Docker host. What happens if the Docker host crashes and we cannot start it again? The containers running on that host become inaccessible as well.
We would need a dedicated engineer who keeps monitoring the containers and their health.
Container Orchestration: a solution that consists of tools and scripts that help us host containers in a production environment.
It consists of multiple Docker hosts that can host containers, so if one of them fails the application is still accessible through the others.
It allows us to deploy hundreds of container instances of our applications in a single command.
It allows us to scale up and scale down based on what we need.
It allows advanced networking between containers.
It load balances across different requests.
With Docker (docker CLI): a single instance in a single command.
With Kubernetes (kubectl): 1000 instances in a single command.
What is Kubernetes
Kubernetes is an open-source Container Orchestration platform for managing containerised workloads and services, such as Docker containers, that facilitates both declarative configuration and automation.
We can script and automatically allocate resources to nodes inside the Kubernetes environment. It allows the infrastructure to run much more effectively and efficiently. It takes care of scaling and failover for your application, provides deployment patterns, and more.
Kubernetes is often abbreviated as k8s.
Why use Kubernetes
Kubernetes has been built to provide us with a reliable infrastructure. It's a tool that helps us manage the containers that we have. It has a modular architecture, which makes it easy to maintain and scale.
The ability to deploy and update applications at scale is the main reason it was created. It allows us to deploy our application to thousands of instances.
At its core, Kubernetes allows us to remove the manual work we do in hosting and managing containers.
Benefits of Kubernetes
Highly portable and 100% open-source: Kubernetes is compatible across platforms, and it is managed by the Cloud Native Computing Foundation.
Workload scalability: k8s is very efficient. New instances are added and removed easily and without any downtime. It handles all container scaling with ease.
High availability: k8s is designed to tackle the availability of both containers and infrastructure, so that the environment stays efficient and available.
Designed for deployment: it speeds up the ability to test, deploy and manage the different phases of the deployment lifecycle.
Service discovery and load balancing: k8s can expose a container using DNS or IP. If there is high traffic, load balancing is handled by the k8s cluster, which distributes the traffic across the network to keep the deployment stable.
Storage orchestration: k8s gives the ability to use local storage or any cloud storage.
Local storage can be an SSD on the machine k8s is running on, or, if k8s is connected to a public cloud like Azure or AWS, we can use the cloud storage and all of the security features the cloud gives us.
Self-healing environment: if there is a failure or a container is not responding anymore, k8s will detect that behaviour and restart the process, or kill that container and start a new one.
Automated rollouts and rollbacks: the desired state of containers can be described using k8s. The actual state of a container is changed to the desired state at a controlled rate, so we can roll forward or roll back easily.
Automatic bin packing: we can specify how much compute power (CPU and RAM) each container will need.
Secret and configuration management: k8s lets you store and manage sensitive information, such as passwords, OAuth tokens, and SSH keys. You can deploy and update secrets and application configuration without rebuilding your container images, and without exposing secrets in your stack configuration.
Kubernetes Architecture
A k8s cluster consists of a set of nodes. A node is a machine, physical or virtual, on which the Kubernetes software is set up. A node is a worker machine and it is where our containers will be launched by k8s. What happens if one of the machines running our containers fails? We need another machine to keep the application running. A cluster is a set of nodes grouped together to keep the application up.
Now, who is responsible for managing the cluster? Who sends information to the cluster about the containers and configuration, and who decides how the nodes handle failures and logs?
The k8s architecture is cluster-based and revolves around two key areas:
k8s master: controls all of the activity within the k8s infrastructure
nodes: Linux environments that are controlled by the master
k8s Master Node
The master node is a node with the k8s control plane components installed. The master watches over the nodes in the cluster and is responsible for the actual orchestration of containers on the worker nodes. When we install k8s on a system we are installing:
Scheduler
Controller
Container Runtime
kubelet
etcd
API Server
The API server acts as the front end for k8s; users, the CLI, and management devices all use the API server to communicate with the cluster.
API Server: a RESTful interface on which we can secure each connection
it is the main channel through which clusters and nodes communicate
it implements an interface so that different tools and libraries can communicate with it effectively
it interacts with the worker nodes and provides them with the information they require
etcd: a tool that holds the configuration and management information for the nodes within a cluster
it is a distributed, reliable key/value store
it stores configuration information required by the nodes in the cluster
it has a key/value format
the only way to access it is via the API server
Scheduler: manages the scheduling of activity within the cluster
it is a key component in the k8s cluster
it is responsible for distributing workload from the master node
it tracks resource utilisation on the cluster nodes and places workloads on available resources; it looks for newly created containers and assigns them to nodes
Controller Manager: a daemon that is the brain behind the orchestration
it runs in continuous loops, gathering information and providing it to the API server
controllers are responsible for noticing and responding when nodes, containers, or endpoints fail; the controller decides to create new containers to replace the failed ones
this component is responsible for changing the current state of a component to the desired state
Container runtime: the software used to run containers
kubelet: the agent that runs on each node in the cluster; it is responsible for making sure the containers on the node are running as expected
kubectl is the k8s CLI, used to deploy and manage applications on a k8s cluster, get cluster information, and get the status of the nodes:
    kubectl run hello-minikube    # start a workload named hello-minikube on the cluster (kubectl run also needs an --image flag)
    kubectl cluster-info          # show cluster information
    kubectl get nodes             # list all of the nodes that are part of the cluster
k8s Node
A k8s node is made up of:
Docker:
it runs and manages the containers inside the node
it is the container runtime
Kubelet:
it is responsible for sharing information between the node and the master's API server
it reads the configuration and keys held in etcd by way of the API server
k8s proxy:
it runs the k8s services inside the node and helps make each service available to external hosts, forwarding requests to the assigned containers
it performs primitive load balancing; it also manages pods on nodes, volumes, secrets, creation of new containers, and health checks
Thank you for reading.
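The desired-state idea from the Kubernetes overview above (a controller compares what is running with what should be running, then acts on the difference) can be illustrated with a toy reconcile step. This is a minimal sketch in plain Python, not the Kubernetes API: the function name `reconcile`, the workload names, and the action tuples are all hypothetical and exist only for illustration.

```python
from collections import Counter


def reconcile(desired, running):
    """Return the actions needed to move `running` toward `desired`.

    `desired` maps a workload name to the number of copies we want;
    `running` is a list with one entry per copy that is currently up.
    Both shapes are assumptions made for this sketch.
    """
    actions = []
    have = Counter(running)

    # Compare each desired workload with what is actually running.
    for name, want in desired.items():
        diff = want - have.get(name, 0)
        if diff > 0:
            actions.append(("start", name, diff))
        elif diff < 0:
            actions.append(("stop", name, -diff))

    # Anything running that nobody asked for gets stopped entirely.
    for name, count in have.items():
        if name not in desired:
            actions.append(("stop", name, count))

    return actions


desired = {"web": 3, "worker": 1}
running = ["web", "worker", "worker", "cache"]
print(reconcile(desired, running))
```

A real controller would run a step like this in a continuous loop and apply the actions gradually, which is what makes controlled rollouts and rollbacks possible.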
I've been removing a couple of dead features this week. You know, those features that senior people in organisations like to tell epic war stories about. Those mighty conversations at dinner parties, where a person involved talks about all the pain and sorrow, about how a particular capability ended up in the software, how (crappy) […]The post The Burden of Features in Software appeared first on ElegantCode.
The Manifesto for Agile Software Development is pretty long in the tooth. That transformative document was created in 2001, 18 years old at the time of this writing. The resulting humble web page ignited a conversation which ultimately changed the way software is written and delivered. Agile techniques have leaked successfully into other types of […]The post Traditional Agile is now missing the point appeared first on ElegantCode.
When learning a new database API, I typically create a small script to exercise the simple CRUD operations to explore how the API is formed, its logical usage, etc. Sometimes I’ll add a few more operations exercising more complex operations. The new JavaScript SDK for Cosmos DB has been up on GitHub about 4 months now, […]The post Exploring the new Cosmos DB async JavaScript API appeared first on ElegantCode.
I've had some great opportunities working with teams applying lean and agile techniques to domains beyond software development. Along with other areas, I'm having lots of conversations around applying Scrum in HR teams. This is fairly unique because the core usage model for Scrum is to apply it when the work is fairly unknown and […]The post Agile in non-software development teams appeared first on ElegantCode.
How Key Vault is used to secure the Healthcare AI Blueprint
System security is a top priority for any healthcare organization. There are many types of security including physical, network, application, email and so on. This article covers the system security provided by Azure Key Vault. Specifically, we examine the Key Vault implementation used in the Azure Healthcare blueprint. The intent is to demonstrate how a Key Vault […]The post How Key Vault is used to secure the Healthcare AI Blueprint appeared first on ElegantCode.
The open source Git project just released Git 2.31 with features and bug fixes from 85 contributors, 23 of them new. Last time we caught up with you, Git 2.29 had just been released. Two
How MLOps can drive governance for machine learning: A conversation with Algorithmia
This post features a guest interview with Diego M. Oppenheimer, CEO at Algorithmia. Over the past few years, machine learning has grown in adoption within the enterprise. More organizations are realizing the importance of machine
How machine learning powers Facebook's News Feed ranking algorithm
Designing a personalized ranking system for more than 2 billion people (all with different interests) and a plethora of content to select from presents significant, complex challenges. This is something we tackle every day with News Feed ranking. Without machine learning (ML), people's News Feeds could be flooded with content they don't find as relevant [...]The post How machine learning powers Facebook's News Feed ranking algorithm appeared first on Facebook Engineering.
Find and Replace in Word - A Microsoft Office Tutorial
Let's say you've written a long article and you're about ready to publish it. But then one of your proof readers lets you know you've spelled a certain word incorrectly, or made some other mistake. And you've done it multiple times, all through that long article. What do you do?
Bashrc Customization Guide - How to Add Aliases, Use Functions, and More
Customizing your .bashrc file can greatly improve your workflow and increase your productivity. The .bashrc is a standard file located in your Linux home directory. In this article I will show you useful .bashrc options, aliases, functions, and more. The main benefits of configuring the .bashrc file are: Adding aliases
What is Git? A Beginner's Guide to Git Version Control
Git is a version control system that developers use all over the world. It helps you track different versions of your code and collaborate with other developers. If you are working on a project over time, you may want to keep track of which changes were made, by whom, and
How to Deploy a React App to Production Using Docker and NGINX with API Proxies
This post will help you to learn how to deploy your React applications to production. We are going to use Docker and NGINX to secure API keys and proxy requests to prevent Cross-Origin Resource Sharing (CORS) violations. You can find the code and video in the summary at the end.
Why You Should Use React Components Instead of HTML
HTML is the language of the web, but creating entire websites with HTML alone can be repetitive and hard to manage. In this article, we're going to see how to use the JavaScript library React as a way to add convenience and reusability to our websites. React is a powerful
In this short and practical article, we will talk about how to handle UI events in Jetpack Compose. In the old system, we used OnClickListeners and other interfaces. In Compose, we can take full advantage of Kotlin's Sealed Classes, Function Types and Lambda Expressions. If you do not know
How to Build an Accordion Menu in React from Scratch - No External Libraries Required
There are many ways to use accordion menus, like displaying a list of FAQs, showing various menus and submenus, displaying the locations of a particular company, and so on. In this article, we'll see how to build an accordion menu in React completely from scratch, step-by-step, without using any external
Excel Classes Online - 11 Free Excel Training Courses
Spreadsheet software like Microsoft Excel is used by everyone from office workers to data scientists. Excel gives a lot of power to people working with data, but it can also be intimidating. We've released a full course on the freeCodeCamp.org YouTube channel that will teach you how to use
How to Build a Weather Application with React and React Hooks
React is a super-awesome front-end library that you can use to build user interfaces. One of the best things about React is that the components we create are encapsulated. In other words, they can't be seen. Let's learn more about how all this works by building a weather application using
End-to-End Integration Testing with NServiceBus: How It Works
In my last post, I walked through setting up end-to-end integration testing with NServiceBus, and how we can use it to black box test message endpoints similar to how the ASP.NET Core integration testing works. In this post, I want to walk through how it all works underneath the
Post by Andy Gordon and Simon Peyton Jones on LAMBDA giving Excel users the ability to define functions.
Ever since it was released in the 1980s, Microsoft Excel has changed how people organize, analyze, and visualize their data, providing a basis for decision-making for the millions of people who use it each day. It's also the world's most widely used programming language. Excel formulas are written by an order of magnitude more users than all the C, C++, C#, Java, and Python programmers in the world combined. Despite its success, considered as a programming language Excel has fundamental weaknesses. Over the years, two particular shortcomings have stood out: (1) the Excel formula language really only supported scalar values (numbers, strings, and Booleans) and (2) it didn't let users define new functions.
Until now.
Applications of Blockchain to Programming Language Theory
Let's talk about Blockchain. Goal is to use this forum topic to highlight its usefulness to programming language theory and practice. If you're familiar with existing research efforts, please share them here. In addition, feel free to generate ideas for how Blockchain could improve languages and developer productivity.As one tasty example: Blockchain helps to formalize thinking about mutual knowledge and common knowledge, and potentially think about sharing intergalactic computing power through vast distributed computing fabrics. If we can design contracts in such a way that maximizes the usage of mutual knowledge while minimizing common knowledge to situations where you have to "prove your collateral", third-party transactions could eliminate a lot of back office burden. But, there might be benefits in other areas of computer science from such research, as well.Some language researchers, like Mark S. Miller, have always dreamed of Agoric and the Decades-Long Quest for Secure Smart Contracts.Some may also be aware that verification of smart contracts is an important research area, because of the notorious theft of purse via logic bug in an Ethereum smart contract.
Applied Category Theory - The Emerging Science of Compositionality
An enjoyable 25-minute introductory talk: YOW! Lambda Jam 2019 - Ken Scambler - Applied Category Theory (slides)What do programming, quantum physics, chemistry, neuroscience, systems biology, natural language parsing, causality, network theory, game theory, dynamical systems and database theory have in common?As functional programmers, we know how useful category theory can be for our work - or perhaps how abstruse and distant it can seem. What is less well known is that applying category theory to the real world is an exciting field of study that has really taken off in just the last few years. It turns out that we share something big with other fields and industries - we want to make big things out of little things without everything going to hell! The key is compositionality, the central idea of category theory.Previously: Seven Sketches in Compositionality: An Invitation to Applied Category Theory.(via Brian McKenna)
PyCharm: Meet the PyCharm Team at Python Web Conference Next Week
Pssst. We know a place where 48 international experts are gathering together to talk about Django, Flask, Pyramid, containers, REST APIs, and other topics that every web developer will want to know more about. It's the Python Web Conference, which is taking place next week starting on March 22.
The PyCharm team will be there attending sessions, chatting with the audience, and waiting for your questions and feedback in our dedicated Slack channel. Don't be shy, come by and chat with us!
Paul Everitt, Developer Advocate for PyCharm, will be giving a talk entitled Static Sites With Sphinx and Markdown on Wednesday March 24 at 11:00 am EDT // 4:00 pm CET. This talk introduces Sphinx for websites, demonstrates how to enable MyST for Markdown, and compares what they have to offer versus other approaches.
In addition to the talk, Paul will be hosting the Quiz Bowl together with several guests. Want to take part? Join our conference Slack channel or follow us on Twitter for more details. And keep an eye out for the conference-welcome video: we did something new this year!
By the way, you still have a chance to get a 15% discount from JetBrains when you register for the event. Just use this code at checkout: JetBrains15.
We hope to see you next week!
Stack Abuse: Validating and Formatting Phone Numbers in Python with phonenumbers
Introduction
Validating phone numbers can be a very challenging task. The format of a phone number can vary from one country to another. Heck, it can also vary within the same country! Some countries share the same country code, while some other countries use more than one country code. According to an example from Google's libphonenumber GitHub repository, the USA, Canada, and the Caribbean islands all share the same country code (+1). On the other hand, phone numbers in Kosovo have been reachable through the Serbian, Slovenian and Monaco country codes.
These are only a few of the challenges in identifying or validating phone numbers. At first glance, one can at least validate the country code of a phone number with a RegEx. However, this means that you would have to write a custom RegEx rule for every country in the world, just to validate a country code. On top of that, some mobile phone carriers have their own rules (for example, certain digits can only use a certain range of numbers). You can see that things can quickly get out of hand and make it almost impossible for us to validate phone number inputs by ourselves.
Luckily, there is a Python library that can help us to get through the validation process easily and efficiently.
The Python Phonenumbers library is derived from Google's libphonenumber library, which is also available for other programming languages like C++, Java, and JavaScript.
In this tutorial, we'll learn how to parse, validate and extract phone numbers, as well as how to extract additional information from the phone number(s) like the carrier, timezone, or geocoder details.
Using the library is very straightforward and it's typically used like this:
    import phonenumbers
    from phonenumbers import carrier, timezone, geocoder

    my_number = phonenumbers.parse("+447986123456", "GB")
    print(phonenumbers.is_valid_number(my_number))
    print(carrier.name_for_number(my_number, "en"))
    print(timezone.time_zones_for_number(my_number))
    print(geocoder.description_for_number(my_number, 'en'))
And here's the output:
    True
    EE
    ('Europe/Guernsey', 'Europe/Isle_of_Man', 'Europe/Jersey', 'Europe/London')
    United Kingdom
Let's get started by setting up our environment and installing the library.
Installing phonenumbers
First, let's create and activate our virtual environment:
    $ mkdir phonenumbers && cd phonenumbers
    $ python3 -m venv venv
    $ . venv/bin/activate # venv\Scripts\activate.bat on Windows
Then we install the Python Phonenumbers library:
    $ pip3 install phonenumbers
This tutorial uses version 8.12.19 of the Phonenumbers library.
Now we are ready to start discovering the Phonenumbers library.
Parse Phone Numbers with Python phonenumbers
Whether you get user input from a web form or other sources, like extracting from some text (more on that later in this tutorial), the input phone number will most likely be a string. As a first step, we'll need to parse it using phonenumbers, and turn it into a PhoneNumber instance so that we can use it for validation and other functionalities.
We can parse the phone number using the parse() method:
    import phonenumbers

    my_string_number = "+40721234567"
    my_number = phonenumbers.parse(my_string_number)
The phonenumbers.parse() method takes a phone number string as a required argument.
You can also pass the country information in ISO Alpha-2 format as an optional argument. Take, for example, the following code into consideration:
    my_number = phonenumbers.parse(my_string_number, "RO")
"RO" stands for Romania in ISO Alpha-2 format. You can check other Alpha-2 and numeric country codes from this website. In this tutorial, for simplicity, I will omit the ISO Alpha-2 country code for most cases and include it only when it's strictly necessary.
The phonenumbers.parse() method already has some built-in basic validation rules like the length of a number string, or checking a leading zero, or for a + sign. Note that this method will raise an exception when any of the needed rules are not fulfilled. So remember to use it in a try/except block in your application.
Now that we got our phone number parsed correctly, let's proceed to validation.
Validate Phone Numbers with Python Phonenumbers
Phonenumbers has two methods to check the validity of a phone number. The main difference between these methods is the speed and accuracy.
To elaborate, let's start with is_possible_number():
    import phonenumbers

    my_string_number = "+40021234567"
    my_number = phonenumbers.parse(my_string_number)
    print(phonenumbers.is_possible_number(my_number))
And the output would be:
    True
Now let's use the same number, but with the is_valid_number() method this time:
    import phonenumbers

    my_string_number = "+40021234567"
    my_number = phonenumbers.parse(my_string_number)
    print(phonenumbers.is_valid_number(my_number))
Even though the input was the same, the result would be different:
    False
The reason is that the is_possible_number() method makes a quick guess on the phone number's validity by checking the length of the parsed number, while the is_valid_number() method runs a full validation by checking the length, phone number prefix, and region.
When iterating over a large list of phone numbers, using phonenumbers.is_possible_number() would provide faster results compared to phonenumbers.is_valid_number().
But as we see here, these results may not always be accurate. It can be useful to quickly eliminate phone numbers that do not comply with the length. So use it at your own risk.
Extract and Format Phone Numbers with Python Phonenumbers
User input is not the only way to get or collect phone numbers. For instance, you may have a spider/crawler that would read certain pages from a website or a document and would extract the phone numbers from the text blocks. It sounds like a challenging problem but luckily, the Phonenumbers library provides us just the functionality we need, with the PhoneNumberMatcher(text, region) class.
PhoneNumberMatcher takes a text block and a region as arguments, then iterates over the text to return the matching results as PhoneNumberMatch objects.
Let's use PhoneNumberMatcher with a random text:
    import phonenumbers

    text_block = "Our services will cost about 2,200 USD and we will deliver the product by the 10.10.2021. For more information, you can call us at +44 7986 123456 or send an e-mail to demo@example.com"
    for match in phonenumbers.PhoneNumberMatcher(text_block, "GB"):
        print(match)
This will print the matching phone numbers along with their index in the string:
    PhoneNumberMatch [131,146) +44 7986 123456
You may have noticed that our number is formatted in the standardized international format and divided by spaces. This may not always be the case in real-life scenarios. You may receive your number in other formats, like divided by dashes or formatted to the national (instead of the international) format.
Let's put PhoneNumberMatcher to the test with other phone number formats:
    import phonenumbers

    text_block = "Our services will cost about 2,200 USD and we will deliver the product by the 10.10.2021. For more information you can call us at +44-7986-123456 or 020 8366 1177 send an e-mail to demo@example.com"
    for match in phonenumbers.PhoneNumberMatcher(text_block, "GB"):
        print(match)
This would output:
    PhoneNumberMatch [130,145) +44-7986-123456
    PhoneNumberMatch [149,162) 020 8366 1177
Even though the phone numbers are embedded deep into the text with a variety of formats and alongside other numbers, PhoneNumberMatcher successfully returns the phone numbers with great accuracy.
Apart from extracting data from the text, we might also want to get the digits one by one from the user. Imagine that your app's UI works similar to modern mobile phones, and formats the phone numbers as you type. For instance, on your web page, you might want to pass the data to your API with each onkeyup event and use AsYouTypeFormatter() to format the phone number with each incoming digit.
Since the UI part is out of the scope of this article, we'll use a basic example for AsYouTypeFormatter. To simulate on-the-fly formatting, let's jump into the Python interpreter:
    >>> import phonenumbers
    >>> formatter = phonenumbers.AsYouTypeFormatter("TR")
    >>> formatter.input_digit("3")
    '3'
    >>> formatter.input_digit("9")
    '39'
    >>> formatter.input_digit("2")
    '392'
    >>> formatter.input_digit("2")
    '392 2'
    >>> formatter.input_digit("2")
    '392 22'
    >>> formatter.input_digit("1")
    '392 221'
    >>> formatter.input_digit("2")
    '392 221 2'
    >>> formatter.input_digit("3")
    '392 221 23'
    >>> formatter.input_digit("4")
    '392 221 23 4'
    >>> formatter.input_digit("5")
    '392 221 23 45'
Not all user input is formatted as it is typed. Some forms have simple text input fields for phone numbers. However, that doesn't necessarily mean that we'll have data entered in a standard format.
The Phonenumbers library has us covered here too, with the format_number() method. This method allows us to format phone numbers into three well-known, standardized formats: National, International, and E164.
National and International formats are pretty self-explanatory, while the E164 format is an international phone number format that limits phone numbers to 15 digits and formats them as {+}{country code}{number with area code}. For more information on E164, you can check this Wikipedia page.
Let's start with the national formatting:
    import phonenumbers

    my_number = phonenumbers.parse("+40721234567")
    national_f = phonenumbers.format_number(my_number, phonenumbers.PhoneNumberFormat.NATIONAL)
    print(national_f)
This will return a nicely spaced phone number string in the national format:
    0721 234 567
Now let's try to format a national number in the international format:
    import phonenumbers

    my_number = phonenumbers.parse("0721234567", "RO")  # "RO" is ISO Alpha-2 code for Romania
    international_f = phonenumbers.format_number(my_number, phonenumbers.PhoneNumberFormat.INTERNATIONAL)
    print(international_f)
The above code will return a nicely spaced phone number string:
    +40 721 234 567
Notice that we passed "RO" as the second parameter into the parse() method. Since the input number is a national number, it has no country code prefix to hint at the country. In these cases, we need to specify the country with its ISO Alpha-2 code to get an accurate result. If the number has no country code prefix and no ISO Alpha-2 code is passed either, parse() raises NumberParseException: (0) Missing or invalid default region.
Now let's try the E164 formatting option. We'll pass a national string as the input:
    import phonenumbers

    my_number = phonenumbers.parse("0721234567", "RO")
    e164_f = phonenumbers.format_number(my_number, phonenumbers.PhoneNumberFormat.E164)
    print(e164_f)
The output will be very similar to PhoneNumberFormat.INTERNATIONAL, except without the spaces:
    +40721234567
This is very useful when you want to pass the number to a background API.
It isn't uncommon for APIs to expect phone numbers to be non-spaced strings.
Get Additional Information on Phone Number
A phone number is loaded with data about a user that could be of interest to you. You may want to use different APIs or API endpoints depending on the carrier of the particular phone number since this plays a role in the product cost. You might want to send your promotion notifications depending on your customer's (phone number's) timezone so that you don't send them a message in the middle of the night. Or you might want to get information about the phone number's location so that you can provide relevant information. The Phonenumbers library provides the necessary tools to fulfill these needs.
To start with the location, we will use the description_for_number() method from the geocoder module. This method takes in a parsed phone number and a short language name as parameters.
Let's try this with our previous fake number:
    import phonenumbers
    from phonenumbers import geocoder

    my_number = phonenumbers.parse("+447986123456")
    print(geocoder.description_for_number(my_number, "en"))
This will print out the origin country of the phone number:
    United Kingdom
Short language names are pretty intuitive. Let's try to get output in Russian:
    import phonenumbers
    from phonenumbers import geocoder

    my_number = phonenumbers.parse("+447986123456")
    print(geocoder.description_for_number(my_number, "ru"))
The output is the name of the United Kingdom written in Russian. You can try it out with other languages of your preference like "de", "fr", "zh", etc.
As mentioned before, you might want to group your phone numbers by their carriers, since in most cases it will have an impact on the cost. To clarify, the Phonenumbers library will probably provide most of the carrier names accurately, but not 100% of them.
Today in most countries it is possible to get your number from one carrier and later on move the same number to a different carrier, leaving the phone number exactly the same.
Since Phonenumbers is merely an offline Python library, it is not possible to detect these changes. So it's best to approach the carrier names as a reference, rather than a fact.
We will use the name_for_number() method from the carrier module:
    import phonenumbers
    from phonenumbers import carrier

    my_number = phonenumbers.parse("+40721234567")
    print(carrier.name_for_number(my_number, "en"))
This will display the original carrier of the phone number if possible:
    Vodafone
Note: As mentioned in the original documentation of Python Phonenumbers, carrier information is available for mobile numbers in some countries, not all.
Another important piece of information about a phone number is its timezone. The time_zones_for_number() method will return a tuple of timezones that the number belongs to. We'll import it from phonenumbers.timezone:
    import phonenumbers
    from phonenumbers import timezone

    my_number = phonenumbers.parse("+447986123456")
    print(timezone.time_zones_for_number(my_number))
This will print the following timezones:
    ('Europe/Guernsey', 'Europe/Isle_of_Man', 'Europe/Jersey', 'Europe/London')
This concludes our tutorial on Python Phonenumbers.
Conclusion
We learned how to parse phone numbers with the parse() method, extract numbers from text blocks with PhoneNumberMatcher(), take phone numbers digit by digit and format them with AsYouTypeFormatter(), use different validation methods with is_possible_number() and is_valid_number(), format numbers using the NATIONAL, INTERNATIONAL, and E164 formats, and extract additional information from phone numbers using the geocoder, carrier, and timezone modules.
Remember to check out the original GitHub repo of the Phonenumbers library. Also if you have any questions in mind, feel free to comment below.
Python for Beginners: How to Join Strings in Python 3
Programmers are destined to work with plenty of string data. This is partially because computer languages are tied to human language: we use one to create the other, and vice versa. For this reason, it's a good idea to master the ins and outs of working with strings early on. In Python, this includes learning how to join strings.
Manipulating strings may seem daunting, but the Python language includes tools that make this complex task easier. Before diving into Python's toolset, let's take a moment to examine the properties of strings in Python.
A Little String Theory
As you may recall, in Python, strings are an array of character data.
An important point about strings is that they are immutable in the Python language. This means that once a Python string is created, it cannot be changed. Changing the string would require creating an entirely new string, or overwriting the old one. We can verify this feature of Python by creating a new string variable. If we try to change a character in the string, Python will raise a TypeError and print a traceback.
    >>> my_string = "Python For Beginners"
    >>> my_string[0]
    'P'
    >>> my_string[0] = 'p'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'str' object does not support item assignment
It's a good idea to keep the immutable quality of strings in mind when writing Python code. While you can't change strings in Python, you can join them, or append them. Python comes with many tools to make working with strings easier.
In this lesson, we'll cover various methods for joining strings, including string concatenation. When it comes to joining strings, we can make use of Python operators as well as built-in methods. As students progress, they're likely to make use of each of these techniques in one way or another. Each has its own purpose.
Joining Strings with the + Operator
Concatenation is the act of joining two or more strings to create a single, new string.
In Python, strings can be concatenated using the + operator.
Similar to a math equation, this way of joining strings is straightforward, allowing many strings to be added together. Let's take a look at some examples:
    # joining strings with the '+' operator
    first_name = "Bilbo"
    last_name = "Baggins"

    # join the names, separated by a space
    full_name = first_name + " " + last_name
    print("Hello, " + full_name + ".")
In our first example, we created two strings, first_name and last_name, then joined them using the + operator. For clarity, we added a space between the names.
Running the file, we see the following text in the Command Prompt:
    Hello, Bilbo Baggins.
The print statement at the end of the example shows how joining strings can generate text that is more legible. By adding punctuation through the power of concatenation, we can create Python programs that are easier to understand, easier to update, and more likely to be used by others.
Let's look at another example. This time we'll make use of a for loop to join our string data.
    # some characters from Lord of the Rings
    characters = ["Frodo", "Gandalf", "Sam", "Aragorn", "Eowyn"]
    storyline = ""

    # loop through each character and add them to the storyline
    for i in range(len(characters)):
        # include "and" before the last character in the list
        if i == len(characters)-1:
            storyline += "and " + characters[i]
        else:
            storyline += characters[i] + ", "

    storyline += " are on their way to Mordor to destroy the ring."
    print(storyline)
This more advanced example shows how concatenation can be used to generate human-readable text from a Python list. Using a for loop, each name in the list of characters (taken from the Lord of the Rings novels) is joined, one by one, to the storyline string. A conditional statement was included inside this loop to check whether or not we've reached the last object in the list of characters. If we have, an additional "and" is included so that the final text is more legible.
We're also sure to include our Oxford commas for additional legibility.
Here's the final output:
    Frodo, Gandalf, Sam, Aragorn, and Eowyn are on their way to Mordor to destroy the ring.
This method will NOT work unless both objects are strings. For instance, trying to join a string with a number will produce an error.
    >>> string = "one" + 2
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: can only concatenate str (not "int") to str
As you can see, Python will only let us concatenate a string to another string. It's our job as programmers to understand the limitations of the languages we're working with. In Python, we'll need to make sure we're concatenating the correct types of objects if we wish to avoid any errors.
Joining Lists with the ‘+’ Operator
The + operator can also be used to join two or more lists of string data. For instance, if we had three lists, each with their own unique strings, we could use the + operator to create a new list combining elements from all three.
    hobbits = ["Frodo", "Sam"]
    elves = ["Legolas"]
    humans = ["Aragorn"]
    print(hobbits + elves + humans)
As you can see, the + operator has many uses. With it, Python programmers can easily combine string data and lists of strings.
Joining Strings with the .join() Method
If you're dealing with an iterable object in Python, chances are you'll want to use the .join() method. Any Python iterable whose elements are strings can be joined this way, including strings, lists, and dictionaries. The .join() method is a string instance method.
The syntax is as follows, where string_name is the separator:
    string_name.join(iterable)
Here's an example of using the .join() method to concatenate a list of strings:
    numbers = ["one", "two", "three", "four", "five"]
    print(','.join(numbers))
Running the program in the command prompt, we'll see the following output:
    one,two,three,four,five
The .join() method will return a new string that includes all the elements in the iterable, joined by a separator. In the previous example, the separator was a comma, but any string can be used to join the data.
    numbers = ["one", "two", "three", "four", "five"]
    print(' and '.join(numbers))
We can also use this method to join a list of single characters, using an empty string as the separator.
    title = ['L','o','r','d',' ','o','f',' ','t','h','e',' ','R','i','n','g','s']
    print(''.join(title))
The .join() method can also be used to get a string with the contents of a dictionary. When using .join() this way, the method will only return the keys in the dictionary, and not their values.
    number_dictionary = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
    print(', '.join(number_dictionary))
Whatever the iterable, the result of .join() is always a single new string, and every element being joined must itself be a string.
Copying Strings with * Operator
If you need to join two or more identical strings, it's possible to use the * operator. With the * operator, you can repeat a string any number of times.
    fruit = "apple "
    print(fruit * 2)
The * operator can be combined with the + operator to concatenate strings. Combining the two operators lets us build longer strings from a few short pieces.
    fruit1 = "apple"
    fruit2 = "orange"
    fruit1 += " "
    fruit2 += " "
    print(fruit1 * 2 + " " + fruit2 * 3)
Splitting and Rejoining Strings
Because strings in Python are immutable, it's quite common to split and rejoin them. The .split() method is another string instance method. That means we can call it on any string object. Like the .join() method, the .split() method uses a separator to parse the string data.
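As a small illustration of the separator idea, the sketch below splits a comma-separated string on an explicit separator and glues it back together with the same one. The variable names and sample data are made up for this example and do not come from the tutorial.

```python
# Hypothetical sample data, invented for this illustration.
csv_line = "Frodo,Sam,Gandalf,Aragorn"

# Split on an explicit separator (a comma) to get a list of names.
names = csv_line.split(",")

# Join with the same separator to rebuild the original string.
rebuilt = ",".join(names)

# .join() only accepts strings, so convert other values with str() first.
ages = [50, 38, 87]
ages_text = ", ".join(str(age) for age in ages)

print(names)
print(rebuilt)
print(ages_text)
```

Using the same separator for both calls means the rebuilt string is identical to the original one.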
By default, whitespace is used as the separator for this method.
Let's take a look at the .split() method in action.
    names = "Frodo Sam Gandalf Aragorn"
    print(names.split())
This code outputs a list of strings.
    ['Frodo', 'Sam', 'Gandalf', 'Aragorn']
Using another example, we'll see how to split a sentence into its individual parts.
    story = "Frodo took the ring of power to the mountain of doom."
    words = story.split()
    print(words)
The .split() method returns a new list. Because a list is iterable, we can use the .join() method we learned about earlier to glue the strings back together.
    original = "Frodo took the ring of power to the mountain of doom."
    words = original.split()
    remake = ' '.join(words)
    print(remake)
By using string methods in Python, we can easily split and join strings. These methods are crucial to working with strings and iterable objects.
Tying Up the Loose Ends
By now, you should have a deeper knowledge of strings and how to use them in Python 3. Working through the examples provided in this tutorial will be a great start on your journey to mastering Python.
No student, however, can succeed alone. That's why we've compiled a list of additional resources provided by Python For Beginners to help you complete your training.
- Learn how to create a Python Dictionary.
- A beginner's guide to Python List Comprehension.
With help and patience, anyone can learn the basics of Python. If working to join strings in Python seems daunting, take some time to practice the examples above. By familiarizing yourself with string variables and working with their methods, you'll soon be comfortable building and reshaping text in Python.
The post How to Join Strings in Python 3 appeared first on PythonForBeginners.com.
This release is dedicated to fixing bugs and enhancing performance. We are also working on implementing the concept of trusted projects, which is designed to mitigate the risks associated with opening projects from unknown and untrusted sources.
You can upgrade to v2020.3.4 with the Toolbox App, or right from the IDE, or by using snap if you are an Ubuntu user. It is also available for download from our website.
Trusted Projects
The simple act of opening a project in the IDE can lead to the automatic execution of code from the project's virtual environment, specifically its activation script. This can pose a significant risk if a malicious actor creates the project. Unfortunately, the risk is not merely hypothetical. There have been recent attempts to attack security researchers by sending them Visual Studio projects containing malicious code.
We've introduced the concept of Trusted Projects to mitigate these risks. When you open an imported or cloned project that contains a virtual environment, PyCharm doesn't execute the auto-configuration of the virtual environment. Instead, it first checks whether the project is from a trusted location. If the project folder is not listed as a trusted location, PyCharm won't proceed with the auto-configuration of its interpreter. Instead, PyCharm will let you decide whether to use the project's interpreter or configure another Python interpreter.
PyCharm makes it possible to identify trusted locations in Preferences/Settings | Build, Execution, Deployment | Trusted Locations. Projects in directories specified as Trusted Locations are always considered trusted. To ensure that projects are treated as untrusted only in unusual circumstances, we recommend adding the directory where you usually create projects to your trusted locations.
Other notable improvements
- Apple ARM chip (Apple Silicon): the OS X Keychain is now accessible from your IDE. [IDEA-258912]
- Code insight: inspections work as expected for decorators defined as classes. [PY-46768]
- Pytest: failed tests for run configurations with additional arguments can now be rerun without errors. [PY-46006]
- Markdown: all characters are now rendered correctly in the preview tab. [IDEA-258796]
- Docker: we've fixed the issue causing log spamming when disconnecting from Docker. [IDEA-259400]
- Web development: the Vue.js plugin no longer breaks HTML templating. [PY-46857]
You can refer to the release notes for a full list of issues resolved in this version. Update to v2020.3.4 now, and don't forget to share your feedback with us in the comments to this post or post your suggestions to our issue tracker.
RoseHosting Blog: How to Install Anaconda on Ubuntu 20.04
Anaconda is a free, open-source distribution of the Python and R programming languages, and one of the most popular. Generally, it ...
The post How to Install Anaconda on Ubuntu 20.04 appeared first on RoseHosting.
Real Python: Python AI: How to Build a Neural Network & Make Predictions
If you're just starting out in the artificial intelligence (AI) world, then Python is a great language to learn since most of the tools are built using it. Deep learning is a technique used to make predictions using data, and it heavily relies on neural networks. Today, you'll learn how to build a neural network from scratch.
In a production setting, you would use a deep learning framework like TensorFlow or PyTorch instead of building your own neural network. That said, having some knowledge of how neural networks work is helpful because you can use it to better architect your deep learning models.
In this tutorial, you'll learn:
- What artificial intelligence is
- How both machine learning and deep learning play a role in AI
- How a neural network functions internally
- How to build a neural network from scratch using Python
Let's get started!
Artificial Intelligence Overview
In basic terms, the goal of using AI is to make computers think as humans do. This may seem like something new, but the field was born in the 1950s.
Imagine that you need to write a Python program that uses AI to solve a sudoku problem. A way to accomplish that is to write conditional statements and check the constraints to see if you can place a number in each position. Well, this Python script is already an application of AI because you programmed a computer to solve a problem!
Machine learning (ML) and deep learning (DL) are also approaches to solving problems. The difference between these techniques and a Python script is that ML and DL use training data instead of hard-coded rules, but all of them can be used to solve problems using AI.
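The hard-coded-rule approach mentioned above can be sketched as a single constraint check. This is only an illustrative sketch, not code from the article: the function name can_place and the 9x9 list-of-lists grid (with 0 marking an empty cell) are assumptions made for this example.

```python
def can_place(grid, row, col, number):
    """Return True if number can go at (row, col) without breaking sudoku rules."""
    # Rule 1: the number must not already appear in the same row.
    if number in grid[row]:
        return False
    # Rule 2: the number must not already appear in the same column.
    if any(grid[r][col] == number for r in range(9)):
        return False
    # Rule 3: the number must not already appear in the same 3x3 box.
    box_row, box_col = 3 * (row // 3), 3 * (col // 3)
    for r in range(box_row, box_row + 3):
        for c in range(box_col, box_col + 3):
            if grid[r][c] == number:
                return False
    return True


# A mostly empty grid with a single 5 in the top-left corner.
grid = [[0] * 9 for _ in range(9)]
grid[0][0] = 5

print(can_place(grid, 0, 8, 5))  # same row as the existing 5
print(can_place(grid, 4, 4, 5))  # unrelated row, column, and box
```

Every rule here is written by hand, which is the contrast with ML and DL: those approaches would learn from example data rather than from explicit conditions like these.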
In the next sections, you'll learn more about what differentiates these two techniques.
Machine Learning
Machine learning is a technique in which you train the system to solve a problem instead of explicitly programming the rules. Getting back to the sudoku example in the previous section, to solve the problem using machine learning, you would gather data from solved sudoku games and train a statistical model. Statistical models are mathematically formalized ways to approximate the behavior of a phenomenon.
A common machine learning task is supervised learning, in which you have a dataset with inputs and known outputs. The task is to use this dataset to train a model that predicts the correct outputs based on the inputs. The image below presents the workflow to train a model using supervised learning:
[Figure: Workflow to train a machine learning model]
The combination of the training data with the machine learning algorithm creates the model. Then, with this model, you can make predictions for new data.
Note: scikit-learn is a popular Python machine learning library that provides many supervised and unsupervised learning algorithms. To learn more about it, check out Split Your Dataset With scikit-learn's train_test_split().
The goal of supervised learning tasks is to make predictions for new, unseen data. To do that, you assume that this unseen data follows a probability distribution similar to the distribution of the training dataset. If in the future this distribution changes, then you need to train your model again using the new training dataset.
Feature Engineering
Prediction problems become harder when you use different kinds of data as inputs. The sudoku problem is relatively straightforward because you're dealing directly with numbers. What if you want to train a model to predict the sentiment in a sentence? Or what if you have an image, and you want to know whether it depicts a cat?
Another name for input data is feature, and feature engineering is the process of extracting features from raw data. When dealing with different kinds of data, you need to figure out ways to represent this data in order to extract meaningful information from it.
An example of a feature engineering technique is lemmatization, in which you remove the inflection from words in a sentence. For example, inflected forms of the verb "watch", like "watches", "watching", and "watched", would be reduced to their lemma, or base form: "watch".
If you're using arrays to store each word of a corpus, then by applying lemmatization, you end up with a less-sparse matrix. This can increase the performance of some machine learning algorithms. The following image presents the process of lemmatization and representation using a bag-of-words model:
[Figure: Creating features using a bag-of-words model]
First, the inflected form of every word is reduced to its lemma. Then, the number of occurrences of that word is computed. The result is an array containing the number of occurrences of every word in the text.
Deep Learning
Read the full article at https://realpython.com/python-ai-neural-network/
<p>Sponsored by <strong>Linode!</strong> <a href="https://pythonbytes.fm/linode"><strong>pythonbytes.fm/linode</strong></a>Special guest: <a href="https://twitter.com/SebaWitowski"><strong>Sebastian Witowski</strong></a></p><a href='https://www.youtube.com/watch?v=Omdzzl6XHDE' style='font-weight: bold;'>Watch on YouTube</a><br><br><p><strong>Brian #1:</strong> <strong>Raspberry Pi Pico</strong></p><ul><li><a href="https://www.raspberrypi.org/blog/raspberry-pi-silicon-pico-now-on-sale/">Release Announcement</a></li><li><a href="https://www.tomshardware.com/reviews/raspberry-pi-pico-review">A review</a></li><li>$4 microcontroller</li><li>Small</li><li>Extremely low power needs.</li><li>Built on RP2040, a brand-new chip developed by Raspberry Pi</li><li>Related: <a href="https://codewith.mu/"><strong>Mu : codewith.mu</strong></a><strong>,</strong> <a href="https://mu.readthedocs.io/en/latest/changes.html">1.1.0-beta.2</a><ul><li>Mu is a simple Python editor for beginner programmers. </li><li>1.1.0 support new boards, including Pico, Lego Spike, plus lots of new fixes.</li></ul></li></ul><p><strong>Michael #2:</strong> <a href="https://dev.to/romanright/announcing-beanie-mongodb-odm-56e"><strong>New MongoDB ODM: Beanie</strong></a> </p><ul><li>via PyCoders</li><li>Beanie - is an asynchronous ODM for MongoDB, based on <a href="https://motor.readthedocs.io/en/stable/">Motor</a> and <a href="https://pydantic-docs.helpmanual.io/">Pydantic</a>.</li><li>Very new but also very exciting.</li><li>Main component of Beanie is <a href="https://pydantic-docs.helpmanual.io/">Pydantic</a>. It helps to implement the main feature - data structuring. 
</li><li>Beanie <code>Document</code> - is an abstraction over the Pydantic <code>BaseModel</code> that allows working with Python objects at the application level and JSON objects at the database level.</li><li>Example, classes:</li></ul><pre><code>from enum import Enum
from typing import List, Optional

from beanie import Document
from pydantic import BaseModel


class TagColors(str, Enum):
    RED = "RED"
    BLUE = "BLUE"
    GREEN = "GREEN"


class Tag(BaseModel):
    name: str
    color: TagColors = TagColors.BLUE


class Note(Document):  # This is the document structure
    title: str
    text: Optional[str]
    tag_list: List[Tag] = []
</code></pre><p><strong>Sebastian #3:</strong> <a href="https://sourcery.ai/"><strong>Sourcery</strong></a> </p><ul><li>No, not the Terry Pratchett novel (although this one is pretty cool too!)</li><li>Gives you refactoring recommendations in your code editor</li><li>Integrates with PyCharm and VS Code</li><li>Super easy to use - you get suggestions as you type and with one click you can apply them</li><li>Free to use in the code editor (you will need a personal token) and paid plans with analytics, CI integration, etc.</li><li>It keeps finding errors in my code (well, maybe I'm just a bad programmer)</li></ul><p><strong>Michael #4:</strong> <a href="https://neomodel.readthedocs.io/en/latest/"><strong>Neomodel</strong></a></p><ul><li>An Object Graph Mapper (OGM) for the <a href="https://www.neo4j.org">neo4j</a> graph database, built on the awesome <a href="https://github.com/neo4j/neo4j-python-driver">neo4j_driver</a></li><li>Features:</li><li>Familiar Django model style definitions.</li><li>Powerful query API.</li><li>Enforce your schema through cardinality restrictions.</li><li>Full transaction support.</li><li>Thread safe.</li><li>pre/post save/delete hooks.</li><li>Django integration via <a href="https://github.com/neo4j-contrib/django-neomodel">django_neomodel</a></li><li>Example of classes</li></ul><pre><code>from neomodel import (config, StructuredNode, StringProperty, IntegerProperty,
                      UniqueIdProperty, RelationshipTo, RelationshipFrom)

config.DATABASE_URL = 'bolt://neo4j:password@localhost:7687'


class Country(StructuredNode):
    code = StringProperty(unique_index=True, required=True)

    # traverse incoming IS_FROM relations, inflate to Person objects
    # (needed for the germany.inhabitant queries below)
    inhabitant = RelationshipFrom('Person', 'IS_FROM')


class Person(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(unique_index=True)
    age = IntegerProperty(index=True, default=0)

    # traverse outgoing IS_FROM relations, inflate to Country objects
    country = RelationshipTo(Country, 'IS_FROM')
</code></pre><ul><li>Relationships</li></ul><pre><code># .get() returns a single node; .filter() would return a whole node set
germany = Country.nodes.get(code='DE')
jim = Person.nodes.get(name='Jim')
jim.country.connect(germany)

if jim.country.is_connected(germany):
    print("Jim's from Germany")

for p in germany.inhabitant.all():
    print(p.name)  # Jim, ...

len(germany.inhabitant)  # N: int

# Find people called 'Jim' in Germany
germany.inhabitant.search(name='Jim')

# Find all the people in Germany except those called 'Jim'
germany.inhabitant.exclude(name='Jim')
</code></pre><p><strong>Brian #5:</strong> <a href="https://blog.thea.codes/my-python-testing-style-guide/#a-mock-must-always-have-a-spec"><strong>A mock must always have a spec</strong></a></p><ul><li>From Stargirl Flowers <a href="https://blog.thea.codes/my-python-testing-style-guide/">My Python testing style guide</a> </li><li>Great guide altogether, but this bit about mock specs is awesome.</li><li>Some mocking guidance:<ul><li>Use real objects for collaborators whenever possible</li><li>A mock must always have a spec with <code>mock.create_autospec()</code> or <code>mock.patch(..., autospec=True)</code>.<ul><li> This ensures there's some connection between your mock and the real collaborator's interface. If you change the collaborator's interface in a way that breaks downstream targets, those targets' tests will rightfully fail. 
</li></ul></li><li>Consider using a stub or fake (with examples)</li><li>Consider a spy (real object + mock wrapper lets you assert called and such)</li><li>Don't give mock/stubs/fakes special names</li><li>Use factory helpers to create complex collaborators</li></ul></li><li>And then some random weird advice:<ul><li> Use fixtures sparingly - Now them's fighting words. :)</li></ul></li></ul><p><strong>Sebastian #6:</strong> <a href="https://pypi.org/project/conference-radar/"><strong>Conference radar</strong></a></p><ul><li>The PyCon 2021 Call for Proposal acceptance emails will be sent soon, so let's talk about conferences.</li><li>It's 2021, and just like last year, most conferences are moving to an online format.</li><li>Which is great, because it's so much easier to attend them. Not only did the tickets get cheaper or even free, but you also don't have to pay for accommodation or plane tickets, and you don't have to actually fly anywhere.</li><li>But how do you find out which conferences are coming up? There is a list of conferences at python.org, but it doesn't have the smaller, local events, and you don't immediately see when each conference is taking place.</li><li>I've found a tool called "conference radar" - a Python package that gives you a CLI tool to check for upcoming Python conferences!</li><li>It prints a nice ASCII table with the dates of each conference. There are even some options that you can pass, for example, to see which conferences have an open Call for Proposal, in case you want to submit something. </li><li>The main downside is that plenty of conferences are not included there, but I hope that the list of sources will be expanded in the future. The CFP flag is also not working very well, I guess, because it's hard to parse the data sources and extract this information automatically.</li><li>So far, my best way to stay on top of the open CFPs is to follow my friend Miroslav on Twitter.</li><li>MK: Heads up on installation. 
They say you can <code>pip install conrad</code> but the actual command is <code>pip install conference-radar</code></li></ul><p><strong>Extras</strong></p><p>Michael</p><ul><li>Announcing Modern Python Projects course: <a href="https://talkpython.fm/modern-python-projects"><strong>talkpython.fm/modern-python-projects</strong></a></li><li>Now highlighting <strong>live</strong> livestreams on Python Bytes: <a href="https://pythonbytes.fm/stream/live"><strong>pythonbytes.fm/stream/live</strong></a></li><li>Mars again. Yes, <strong>Python IS on Mars</strong>. See <a href="https://twitter.com/tjmcgrew/status/1370009196167626752">tweet</a>.</li><li>Signups for the Python Language Summit at PyCon (online only) <a href="https://twitter.com/gvanrossum/status/1371563816786358274?cn=ZmxleGlibGVfcmVjcw%3D%3D&amp;refsrc=email"><strong>are now open</strong></a>.</li></ul><p>Sebastian</p><ul><li>I've started using VS Code in the browser (a new project for a new client), and it's surprisingly good! I would never try it myself (I love to have all my tools installed locally on my computer). The worst part? Learning to not click "Ctrl+W" when I want to close a tab in VS Code (as it closes the whole tab in my browser). I'm curious to see if it will ever become a standard in programming. It's definitely a great way to set up a standardized development platform for the whole team.</li></ul><p><strong>Joke</strong>: <strong><a href="https://trello-attachments.s3.amazonaws.com/6041d3e66056524cad8fd110/460x936/02ad871763c0031a1fab0ef4c57be88f/Screen_Shot_2021-03-04_at_10.35.54_PM.png">He has WiFi</a></strong></p>
Some of you read my previous post on typing.Protocols and probably wondered: "what about zope.interface?" I've advocated strongly for it in the past but now that we have Mypy and Protocols, is it simply a relic of an earlier time? Can we entirely replace it with Protocol?
Let's have a look.
Typing in 2 dimensions
In the previous post I discussed structural versus nominal typing. In Mypy's type system, most classes are checked nominally whereas Protocol is checked structurally. However, there's another way that Protocol is distinct from a normal class: normal classes are concrete types, and Protocols are abstract.
Abstract types:
- cannot be instantiated: every instance of an abstract type is an instance of some concrete sub-type, and
- do not include (complete) implementation logic.
Concrete types:
- can be instantiated: they are complete descriptions of a type, and
- must include all their own implementation logic.
Protocols and Interfaces are both abstract, but Interfaces are nominal.
The highest level distinction between the two is that when you have a problem that requires an abstract type, but nominal checking is preferable to structural, Interfaces are a better solution.
Python's built-in Abstract Base Classes are technically abstract-and-nominal as well, but they're in a strange halfway space; they're formally "abstract" because they can't be instantiated, but they're partially concrete in that they can contain any amount of implementation logic themselves, and thereby making an object which is a subtype of multiple ABCs drags in all the usual problems of the conflicting namespaces within multiple inheritance.
Theoretically, there's a way to treat ABCs as purely abstract, which is to use ABCMeta.register, but as of this writing (March 2021) it doesn't work with Mypy, so within the context of "static typing in Python" we presently have to ignore it.
Practicalities
The first major advantage that Protocol has is that since it is now built into Python itself, there's no reason not to use it.
Before Protocol even existed, regardless of all the advantages of adding explicit abstract types to your project with zope.interface, it did still have the small down-side of requiring a new dependency, with all the minor headaches that might imply.
Beyond the theoretical distinctions, there's a question of how well tooling supports zope.interface. There are some clear gaps; there is not a ton of great built-in IDE support for zope.interface; less-sophisticated linters will sometimes still complain that Interfaces don't take self as their first argument. Indeed, Mypy itself does this by default, although more on that in a moment. Less mainstream performance-focused type-checkers like Pyre and Pyright don't support zope.interface, either, although their lack of support for zope.interface is just a part of a broader problem of their lack of extensibility; they also can't support SQLAlchemy or the Django ORM without special-casing in the tools themselves.
But what about Mypy itself? If we have to discount ABCMeta.register due to practical tooling deficiencies, even though it provides a built-in way to declare a nominal-but-abstract type in principle, we need to be able to use zope.interface within Mypy as well for a fair comparison with Protocol.
Can we?
Luckily, yes!
Thanks to Shoobx, there's a fairly actively maintained Mypy plugin that supports zope.interface which you can use to statically check your Interfaces.
However, this plugin does have a few key limitations as of this writing (again, March 2021), which makes its safety guarantees a bit lower-quality than Protocol.
The net result of this is that Protocols have the "home-field advantage" in most cases; out of the box, they'll work more smoothly with your existing editor / linter setup, and as long as your project supports Python 3.6+, at worst (if you can't use Python 3.8, where Protocol is built in to typing) you have to take a type-check-time dependency on the typing_extensions package, whereas with zope.interface you'll need both the run-time dependency of zope.interface itself and the Mypy plugin at type-checking time.
So in a situation where both are roughly equivalent, Protocol tends to win by default. There are undeniably big areas where Interfaces and Protocols overlap, and in plenty of them, using Protocol is a fine idea. But there are still some clear places that zope.interface shines.
First, let's look at a case which Interfaces handle more gracefully than Protocols: opting out of matching a simple shape, where the shape doesn't fully describe its own meaning.
Where Interfaces work best: hidden and complex meanings
"The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information."
Alan Perlis, "Epigrams in Programming", Epigram 34.
The place where structural typing has the biggest advantage is when the type system is expressive enough to fully encode the meaning of the desired behavior within the structure of the type itself.
Consider a Protocol which describes an object that can add some integers together:
    class Math(Protocol):
        def add_integers(self, addend1: int, addend2: int) -> int:
            ...
It's fairly unambiguous what adherents to this Protocol should do, and anyone implementing such a thing should be able to clearly tell that the method is supposed to add a couple of integers together; there's nothing hidden about the structure of the integers, no constraints the type system won't let us specify. It would be quite surprising if anything that didn't have the intended behavior would match this Protocol.
At the other end of the spectrum, we might have a plugin Interface that has a lot of hidden structure. For this example, we have an Interface called IPlugin containing a method with an easy-to-conflict-with name ("name") overloaded with very specific constraints on its return type: the string must contain the dotted-path name of a Python object in an import-able module (like, for example, "os.path.join").
    class IPlugin(Interface):
        def name() -> str:
            "Return the fully-qualified Python identifier of the thing to load."
With Protocols, you can work around these limitations by manually making it harder to match; adding elements to the structure that embed names relevant to its semantics and thereby making the type behave more as if it were nominally typed.
You could make the method's name long and ugly instead (plugin_name_to_load, let's say) or add unused additional attributes (yep_i_am_a_plugin: Literal[True]) in order to reduce the risk of accidental matches, but these workarounds look hacky, and they have to be manually namespaced; if you want to mark it as having semantics associated with your specific plugin system, you have to embed the name of that system in your attributes themselves; here we're just saying "plugin" but if we want to be truly careful, we have to embed the whole name of our project in there.
With Interfaces, the maintainer of each implementation must explicitly opt in, by choosing whether to
specify that they are an @implementer(IPlugin). Since they had to import IPlugin from somewhere, this annotation carries with it a specific, namespaced declaration of semantic intent: "I know what the Interface IPlugin means, and I promise that I can provide it".
This is the most salient distinction between Protocols and Interfaces: if you have strong reasons to want adherents to the abstract type to opt in, you want an Interface; if you want them to match automatically, you want a Protocol.
Runtime support
Interfaces also provide a more nuanced set of runtime checks.
You can say that an object directlyProvides an interface, allowing for some level of (at least runtime) type safety, and ask if IPlugin is .providedBy some object.
You can do most of this with Protocol, but it's awkward. The @runtime_checkable decorator allows your Protocol to make isinstance(x, MyProtocol) work like IMyInterface.providedBy(x), but:
- you're still missing directlyProvides; the runtime checking is all by type, not by the individual properties of the instance;
- it's not the default, so if you're not the one defining the Protocol, there's no guarantee you'll be able to use it.
With Interfaces, there's also no mandatory relationship between the implementer (i.e. the type whose instances fit the specified shape) and the provider (the specific object which can fit the specified shape). This means you get features like classProvides and moduleProvides "for free".
Interfaces work particularly well for communication between frameworks and application code. For example, let's say you're evolving the meaning of an Interface implemented by applications over time, through versions EventHandler, EventHandler2, and EventHandler3, which have similarly named and typed methods, but subtly different expectations on their lifecycle or when precisely the methods will be called.
A framework facing this problem can use a series of Interfaces, and check at runtime to see which of these the application implements, and be secure in the knowledge that the application has properly and intentionally adopted the new interface, and doesn't just happen to have a matching method name against an older version.
Finally, zope.interface gives you adaptation and adapter registries, which can be a useful mechanism for doing things like templating, like a much more powerful version of singledispatch from the standard library.
Adapter registries are nuanced, complex tools and unfortunately an example that captures the full utility of their power would itself be commensurately complex. However, the core of adaptation is the idea that if you have an arbitrary object x, and you want a provider of the interface IY, you can do the following:
    y = IY(x, None)
This performs a multi-stage check:
1. If x already provides IY (either via implementer, provider, directlyProvides, classProvides, or moduleProvides), it's simply returned; so you don't need to special-case the case where you've already got what you want.
2. If x has a __conform__(interface) method, it'll be called with IY as the interface, and if __conform__ returns anything non-None that result will be returned from the call to IY.
3. If IY has a specially-defined __adapt__ method, it can implement its own logic for this hook directly.
4. Each globally-registered function in zope.interface's adapter_hooks will be invoked to find a function that can transform x into an IY provider. Twisted has its own global registry in this list, which is what registerAdapter manipulates.
But from the perspective of the caller, you can just say "I want an IY".
With Protocols, you can emulate this with functools.singledispatch by making a function which returns your Protocol type and registers various types to do conversion.
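A minimal sketch of that singledispatch emulation follows. The Named Protocol, the to_named function, and the User and _StrNamed classes are hypothetical names invented for this illustration; only functools.singledispatch and typing.Protocol come from the standard library (Python 3.8+ is assumed).

```python
from dataclasses import dataclass
from functools import singledispatch
from typing import Protocol


class Named(Protocol):
    """The target shape: anything with a display_name() method."""

    def display_name(self) -> str: ...


@dataclass
class _StrNamed:
    """Small concrete helper (hypothetical) that satisfies Named."""

    value: str

    def display_name(self) -> str:
        return self.value


@dataclass
class User:
    first: str
    last: str


@singledispatch
def to_named(obj: object) -> Named:
    """Conversion function written specifically for the Named Protocol."""
    raise TypeError(f"cannot adapt {type(obj).__name__} to Named")


@to_named.register
def _(obj: str) -> Named:
    # Strings are wrapped directly.
    return _StrNamed(obj)


@to_named.register
def _(obj: User) -> Named:
    # Users are converted by combining their name parts.
    return _StrNamed(f"{obj.first} {obj.last}")


print(to_named("Frodo").display_name())
print(to_named(User("Bilbo", "Baggins")).display_name())
```

Callers only ever ask to_named for a Named, and each source type registers its own conversion, which loosely mirrors saying "I want an IY".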
The place that adapter registries have an advantage is their central nature and consistent idiom for converting to the target type; you can use adaptation for any Interface in the same way, and any type can participate in adaptation in the ways listed above via flexible mechanisms depending on where it makes sense to put your implementation, whereas any singledispatch function to convert to a Protocol needs to be bespoke per-Protocol.

Describing and restricting existing shapes

There are still several scenarios where Protocol's semantics apply more cleanly.

Unlike Interfaces, Protocols can describe the types of things that already exist. To see when that's an advantage, consider a sprawling application that uses tons of libraries and manipulates 3D spatial data points.

There's a convention among these disparate libraries where they all represent a "point" as an object with .x, .y, and .z attributes which are all floats. This is a natural enough shape, given the domain, that lots of your libraries just fit it by accident. You want to write functions that can work with data output by any of these libraries as long as it plausibly looks like your own concept of a Point:

    from typing import Protocol

    class Point(Protocol):
        x: float
        y: float
        z: float

In this case, the thing defining the Protocol is your application; the thing implementing the Protocol is your collection of libraries. Since the libraries don't and can't know about the application (the dependency arrow points the other way), they can't reference the Protocol to note that they implement it.

Using Protocol, you can also restrict an existing type to preserve future flexibility.

For example, let's say we're implementing a "mailbox" type pattern, where some systems deliver messages and other systems retrieve them later. To avoid mix-ups, the system that sends the messages shouldn't retrieve them and vice versa: receivers only receive, and senders only send.
With Protocols, we can describe this without having any new custom concrete types, like so:

    from typing import Protocol, TypeVar

    T_co = TypeVar("T_co", covariant=True)
    T_con = TypeVar("T_con", contravariant=True)

    class Sender(Protocol[T_con]):
        def add(self, item: T_con) -> None:
            "Put an item in the slot."

    class Receiver(Protocol[T_co]):
        def pop(self) -> T_co:
            "Retrieve an item from the PO box."

All of that code is just telling Mypy our intentions; there's no behavior here yet.

The actual implementation is even shorter:

    from typing import Set

    mailbox: Set[int] = set()

Literally no code of our own: set already does the job we described. And how do we use this?

    def send(sender: Sender[int]) -> None:
        sender.add(3)

    def receive(receiver: Receiver[int]) -> None:
        receiver.pop()
        receiver.add(3)
        # Mypy stops us making this mistake:
        # "Receiver[int]" has no attribute "add"

    send(mailbox)
    receive(mailbox)

For its initial implementation, this system requires nothing beyond types available in the standard library; just a set. However, by treating their parameter as a Sender and a Receiver respectively rather than a Set, send and receive prevent themselves from using any functionality from the set passed in aside from the one method that their respective roles are "supposed to see". As a result, Mypy will now tell us if any code which receives the sender object tries to remove objects.

This allows us to use existing data structures in libraries without the usual attendant problem of advertising to all clients that every tiny implementation detail of those existing structures is an intended part of the public interface.
Python has always tried to make these sorts of distinctions by leaving certain things undocumented or saying narratively which things you should rely on, but it's always hit-or-miss (usually miss) whether library consumers will see those admonitions or not; by making it a feature of the programming environment, Mypy makes it harder to ignore.

Conclusions

In modern Python code, when you have an abstract collection of behavior, you should probably consider using a Protocol to describe it by default.

However, Interface is also staying up to date with modern Python tooling with Mypy support, and it can be worthwhile for more sophisticated consumers that want support for nominal typing, or that want to draw on its rich adaptation and component registration feature-set.
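
To make the singledispatch emulation of adaptation mentioned above more concrete, here is a minimal sketch. It is an illustration under assumptions, not code from the article: SimplePoint and to_point are hypothetical names, and the registered conversions are just examples of what an application might choose to support.

```python
from dataclasses import dataclass
from functools import singledispatch
from typing import Protocol


class Point(Protocol):
    x: float
    y: float
    z: float


@dataclass
class SimplePoint:
    """Hypothetical concrete type that structurally satisfies Point."""
    x: float
    y: float
    z: float


@singledispatch
def to_point(value: object) -> Point:
    """Fallback: no conversion is registered for this type."""
    raise TypeError(f"cannot convert {type(value).__name__} to Point")


@to_point.register
def _(value: SimplePoint) -> Point:
    # Already Point-shaped, so hand it back unchanged.
    return value


@to_point.register
def _(value: tuple) -> Point:
    # Treat a 3-tuple as (x, y, z) coordinates.
    x, y, z = value
    return SimplePoint(float(x), float(y), float(z))


p = to_point((1, 2, 3))
print(p)
```

Note that to_point only knows about Point; a second Protocol would need its own dispatch function, which is the "bespoke per-Protocol" cost described above.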
Python for Beginners: How to remove punctuation from a Python String
Often during data analysis tasks, we come across text data which needs to be processed so that useful information can be derived from it. During text processing, we may have to extract or remove certain text from the data to make it useful, or we may need to replace certain symbols and terms with other text. In this article, we will study punctuation marks and look at the methods to remove punctuation marks from Python strings.

What is a punctuation mark?

There are several symbols in English grammar, including the comma, hyphen, question mark, dash, exclamation mark, colon, semicolon, parentheses and brackets, which are termed punctuation marks. These are used in the English language for grammatical purposes, but when we perform text processing in Python we generally have to omit the punctuation marks from our strings. Now we will see different methods to remove punctuation marks from a string in Python.

Removing punctuation marks from string using for loop

In this method, first we will create an empty Python string which will contain the output string. Then we will simply iterate through each character of the input string and check if it is a punctuation mark or not. If the character is a punctuation mark, we will skip it. Otherwise we will include it in our output string using string concatenation.

For example, in the code given below, we have each punctuation mark kept in a string named punctuation. We iterate through the input string myString using a for loop and check whether each character is present in the punctuation string or not. If it is not present, the character is included in the output string newString.
    punctuation = '''!()-[]{};:'"\, <>./?@#$%^&*_~'''
    print("The punctuation marks are:")
    print(punctuation)
    myString = "Python.:F}or{Beg~inn;ers"
    print("Input String is:")
    print(myString)
    newString = ""
    # Keep only the characters that are not punctuation marks.
    for x in myString:
        if x not in punctuation:
            newString = newString + x
    print("Output String is:")
    print(newString)

Output

    The punctuation marks are:
    !()-[]{};:'"\, <>./?@#$%^&*_~
    Input String is:
    Python.:F}or{Beg~inn;ers
    Output String is:
    PythonForBeginners

Remove punctuation marks from python string using regular expressions

We can also remove punctuation marks from strings in Python using regular expressions. For this we will use the re module, which provides functions for processing strings using regular expressions.

In this method, we will substitute each character which is not an alphanumeric or space character with an empty string using the re.sub() method, and hence all of the punctuation will be removed. Note that \w also matches the underscore, so the pattern used below leaves underscores in place.

The syntax for the sub() method is re.sub(pattern1, pattern2, input_string), where pattern1 denotes the pattern of the characters which will be replaced. In our case, we will provide a pattern which matches any character that is not an alphanumeric or space character. pattern2 is the replacement for whatever pattern1 matches. In our case pattern2 will be an empty string, as we just have to remove the punctuation marks from our Python string. input_string is the string which has to be processed to remove punctuation.

Example:

    import re

    myString = "Python.:F}or{Beg~inn;ers"
    print("Input String is:")
    print(myString)
    emptyString = ""
    # [^\w\s] matches any character that is neither a word character nor whitespace.
    newString = re.sub(r'[^\w\s]', emptyString, myString)
    print("Output String is:")
    print(newString)

Output

    Input String is:
    Python.:F}or{Beg~inn;ers
    Output String is:
    PythonForBeginners

Remove punctuation marks from python string using replace() method

The Python string replace() method takes an initial pattern and a final pattern as parameters when invoked on a string, and returns a resultant string where occurrences of the initial pattern are replaced by the final pattern.
We can use the replace() method to remove punctuation from a Python string by replacing each punctuation mark with an empty string. We will iterate over the punctuation marks one by one and replace each of them with an empty string in our text string.

The syntax for the replace() method is replace(character1, character2), where character1 is the character which will be replaced by the character given in the parameter character2. In our case, character1 will contain a punctuation mark and character2 will be an empty string.

    punctuation = '''!()-[]{};:'"\, <>./?@#$%^&*_~'''
    myString = "Python.:F}or{Beg~inn;ers"
    print("Input String is:")
    print(myString)
    emptyString = ""
    # Strip each punctuation mark from the string in turn.
    for x in punctuation:
        myString = myString.replace(x, emptyString)
    print("Output String is:")
    print(myString)

Output:

    Input String is:
    Python.:F}or{Beg~inn;ers
    Output String is:
    PythonForBeginners

Remove punctuation marks from python string using translate() method

The translate() method replaces characters in the input string with new characters according to the translation table provided to the function as a parameter. The translation table should contain the mapping of which characters have to be replaced by which characters. If the table does not have a mapping for a character, that character will not be replaced.

The syntax for the translate() method is translate(translation_dictionary), where translation_dictionary is a Python dictionary mapping characters in the input string to the characters by which they will be replaced.

To create the translation table, we can use the maketrans() method. This method takes the initial characters to be replaced, the final characters, and the characters to be deleted from the string, each given as a string, and returns a Python dictionary which works as the translation table.

The syntax for the maketrans() method is maketrans(pattern1, pattern2, optional_pattern). Here pattern1 will be a string containing all the characters which are to be replaced.
pattern2 will be a string containing the characters by which the characters in pattern1 will be replaced. The length of pattern1 should be equal to the length of pattern2. optional_pattern is a string containing the characters which have to be deleted from the input text. In our case, pattern1 and pattern2 will be empty strings while optional_pattern will be a string containing punctuation marks.

To create a translation table for removing punctuation from a Python string, we can leave the first two parameters of the maketrans() function empty and pass the punctuation marks as the characters to be deleted. In this way all the punctuation marks will be deleted and the output string will be obtained.

Example

    punctuation = '''!()-[]{};:'"\, <>./?@#$%^&*_~'''
    myString = "Python.:F}or{Beg~inn;ers"
    print("Input String is:")
    print(myString)
    # The third argument lists the characters to delete.
    translationTable = str.maketrans("", "", punctuation)
    newString = myString.translate(translationTable)
    print("Output String is:")
    print(newString)

Output

    Input String is:
    Python.:F}or{Beg~inn;ers
    Output String is:
    PythonForBeginners

Conclusion

In this article, we have seen how to remove punctuation marks from strings in Python using a for loop, regular expressions, and built-in string methods like replace() and translate(). Stay tuned for more informative articles.

The post How to remove punctuation from a Python String appeared first on PythonForBeginners.com.
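
As a variation on the translate() approach, the hand-written punctuation string could be swapped for string.punctuation from the standard library, which holds the ASCII punctuation characters. The sketch below is an alternative to the article's examples rather than a replacement for them; remove_punctuation is a hypothetical helper name.

```python
import string


def remove_punctuation(text: str) -> str:
    """Return text with every ASCII punctuation character deleted."""
    # string.punctuation is the standard library's ASCII punctuation set;
    # passing it as the third argument marks those characters for deletion.
    table = str.maketrans("", "", string.punctuation)
    return text.translate(table)


print(remove_punctuation("Python.:F}or{Beg~inn;ers"))
```

Because string.punctuation does not contain a space, this version keeps the whitespace between words.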
#464 MARCH 16, 2021

Pattern Matching Tutorial for Pythonic Code
Structural pattern matching is coming in Python 3.10. This article explores how to use it to write Pythonic code by searching for some of the best use cases for the match statement. Keep in mind that match is still in alpha, so, while unlikely, some things may still change before the final version of 3.10 is released.
RODRIGO GIRÃO SERRÃO (shared by Rodrigo Girão Serrão)

PyQt6 vs PySide6: What's the Difference Between the Two Python Qt Libraries?
There is a new version of Qt (version 6) and with it new versions of PyQt and PySide, now named PyQt6 and PySide6 respectively. Take a look at the latest versions of the libraries to identify the differences between them and find solutions for writing portable code.
MARTIN FITZPATRICK

Introducing App Platform, a new PaaS That Gets Your Apps to Market, Faster
Get your apps to market faster with DigitalOcean's App Platform. Build, deploy, and scale apps quickly using a simple, fully managed solution. We'll handle the infrastructure, app runtimes and dependencies, so that you can push code to production in just a few clicks. Get started w/ $100 free credit
DIGITAL OCEAN (sponsor)

Build a Contact Book With Python, PyQt, and SQLite
In this step-by-step project, you'll build a minimal contact book application using Python, with PyQt to build the application's GUI and SQLite to handle the database.
REAL PYTHON

Rapid Prototyping with Flask, htmx, and Tailwind CSS
Learn how to use htmx and Tailwind CSS with Flask to quickly build interactive front-ends.
AMAL SHAJI (shared by Amal Shaji)

Mu Version 1.1.0-Beta.2 Released
MADEWITH.MU

Discussions

Why Are tar.xz Files 15x Smaller When Using Python's Tar Compared to macOS Tar?
HACKER NEWS

How Can You Quickly Get the Last Line of a Huge CSV File With 48 Million Lines?
STACK OVERFLOW

Python Jobs

How Strong Is Your Resume? (sponsor)
Get a free, confidential review from a resume expert

Senior Backend Developer (Berlin, Germany)
orderbird AG

Full-Stack Django Developer (Oslo, Norway)
unifai

Senior Python Engineer (Remote)
EAB

Back End Developer (Remote)
Interplay Learning

Articles & Tutorials

How to Speed Up Pandas With Modin
The pandas library provides easy-to-use data structures like pandas DataFrames as well as tools for data analysis. One issue with pandas is that it can be slow with large amounts of data. It wasn't designed for analyzing 100 GB or 1 TB datasets. Fortunately, there is the Modin library, which has benefits like the ability to scale your pandas workflows by changing one line of code.
MICHAEL GALARNYK (shared by Michael Galarnyk)

Navigating Options for Deploying Your Python Application
What goes into the decision of how to host your Python code or application in the cloud? Which technology stack is the right size for your project? This week on the show, we have Calvin Hendryx-Parker. Calvin talks about cloud hosting options, infrastructure choices, and deployment tools.
REAL PYTHON (podcast)

Understand the Architecture Around your Python Apps in Containers and Serverless Environments
Epsagon lets dev teams see and understand dependencies and API integrations in microservices architecture. It's a Microservices Observability SaaS with monitoring and investigative tools to drill down and explore modern workloads. Set up Epsagon and see your Python services in minutes
EPSAGON (sponsor)

Interfaces and Protocols
Now that Python developers have Mypy and Python's built-in Protocol, is zope.interface a thing of the past? Can you replace every zope.interface with a Protocol? See how these two approaches to abstract types compare, and when it might make sense to stick with zope.interface.
GLYPH LEFKOWITZ

Python Turns 30: Meet the Man That Helps Keep the Programming Language on Track
Pablo Galindo is a physicist, Python core-dev, and member of the Python steering council. He's also the release manager for Python 3.10 and 3.11. Learn how he got involved with Python and read about his thoughts on the role of the steering council and Python's future.
MAYANK SHARMA

RSVP for the 3rd Annual Python Web Conference (Virtual) | March 22-26, 2021
International experts share best practices for hard Python problems. 50+ talks on Machine Learning, AI, Big Data, Django, Plone, CI/CD, Containers, Serverless, web security, etc. Join JetBrains and Six Feet Up to discuss what the future holds.
SIX FEET UP (sponsor)

Python Booleans: Leveraging the Values of Truth
Learn about the built-in Python Boolean data type, which is used to represent the truth value of an expression. You'll see how to use Booleans to compare values, check for identity and membership, and control the flow of your programs with conditionals.
REAL PYTHON (course)

Performance Comparison: Counting Words in Python, Go, C++, C, AWK, Forth, and Rust
This article is less of a "my language is better than yours" rant and more of an exploration into what idiomatic vs. optimized code looks like in various languages and where surprising bottlenecks lurk.
BEN HOYT

Announcing Beanie: An Asynchronous MongoDB ODM
Beanie is a new asynchronous Python ODM (Object Document Mapper) for MongoDB based on Motor and Pydantic. Learn how Beanie's data model works and see it in action in a minimal web application.
ROMAN

Python Community Interview With Ewa Jodlowska
Learn how Ewa Jodlowska, the executive director of the Python Software Foundation (PSF), started her tech journey and how COVID-19 affected the PSF and plans for PyCon US 2021.
REAL PYTHON

Projects & Code

lucid-sonic-dreams: Sync GAN-generated Visuals to Music
GITHUB.COM/MIKAELALAFRIZ

python-fpe: Format Preserving Encryption Implementation in Python
GITHUB.COM/MYSTO

Hyperactive: A Hyperparameter Optimization and Data Collection Toolbox
GITHUB.COM/SIMONBLANKE

playback: Record Your Service Operations in Production and Replay Them Locally
GITHUB.COM/OPTIBUS

beanie: Micro ODM for MongoDB
GITHUB.COM/ROMAN-RIGHT

Events

Real Python Office Hours (Virtual)
March 17, 2020
REALPYTHON.COM

Python Web Conference 2021
March 22 to March 27, 2021
PYTHONWEBCONFERENCE.COM

An Introduction to Delivering Technical Presentations With Confidence
March 25, 2021
MEETUP.COM

PyCon Israel 2021 (Virtual)
May 2-3, 2021
PYCON.ORG.IL

PyCon 2021 (Virtual)
May 12-18, 2021
PYCON.ORG

DjangoCon Europe 2021 (Virtual)
June 2-6, 2021
DJANGOCON.EU

PyCon Cameroon 2021
March 18 to March 21, 2021
PYTHONCM.ORG

Happy Pythoning!
This was PyCoder's Weekly Issue #464.
Real Python: Python Booleans: Leveraging the Values of Truth
Understanding how Python Boolean values behave is important to programming well in Python. The Python Boolean type is one of Python's built-in data types. It's used to represent the truth value of an expression. For example, the expression 1 <= 2 is True, while the expression 0 == 1 is False.

In this course, you'll learn how to:

- Use Python Booleans to write efficient and readable Python code
- Manipulate Boolean values with Boolean operators
- Convert Booleans to other types
- Convert other types to Python Booleans
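
The four topics listed above can each be illustrated in a line or two. The snippet below is a small illustrative sketch, not material from the course itself; the variable names are arbitrary.

```python
# Comparisons evaluate to the Boolean values True and False.
in_order = 1 <= 2      # True
is_equal = 0 == 1      # False

# Boolean operators combine truth values.
both = in_order and is_equal     # False
either = in_order or is_equal    # True
negated = not is_equal           # True

# Booleans convert to other types; bool is a subclass of int.
as_int = int(in_order)                # 1
count = sum([True, False, True])      # 2
as_text = str(is_equal)               # "False"

# Other types convert to Booleans: zero and empty containers are falsy.
from_empty_list = bool([])            # False
from_zero = bool(0)                   # False
from_text = bool("False")             # True, any non-empty string is truthy

print(in_order, is_equal, both, either, negated)
```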
Suppose you need to scrape data from a website after translating the web page in R and Python. In Google Chrome, there is an option to translate pages written in a foreign language. If you are an English speaker who doesn't know any other language and you want to extract data from a website that has no option to switch its language to English, this article will show you how to translate a web page.

What is Selenium?

You may not be familiar with Selenium, so it is important to understand the background. Selenium is an open-source tool which is very popular in the testing domain and is used for automating web browsers. It allows you to write test scripts in several programming languages. Selenium is available in both R and Python.

Translate Page in Web Scraping in R and Python

In R there is a package named RSelenium, whereas in Python Selenium can be installed via the selenium package. Following is a list of the languages Chrome supports along with their codes. You need these codes to tell Chrome which language to translate the web page from and which language to translate it to.

    Name                    Code
    Amharic                 am
    Arabic                  ar
    Basque                  eu
    Bengali                 bn
    English (UK)            en-GB
    Portuguese (Brazil)     pt-BR
    Bulgarian               bg
    Catalan                 ca
    Cherokee                chr
    Croatian                hr
    Czech                   cs
    Danish                  da
    Dutch                   nl
    English (US)            en
    Estonian                et
    Filipino                fil
    Finnish                 fi
    French                  fr
    German                  de
    Greek                   el
    Gujarati                gu
    Hebrew                  iw
    Hindi                   hi
    Hungarian               hu
    Icelandic               is
    Indonesian              id
    Italian                 it
    Japanese                ja
    Kannada                 kn
    Korean                  ko
    Latvian                 lv
    Lithuanian              lt
    Malay                   ms
    Malayalam               ml
    Marathi                 mr
    Norwegian               no
    Polish                  pl
    Portuguese (Portugal)   pt-PT
    Romanian                ro
    Russian                 ru
    Serbian                 sr
    Chinese (PRC)           zh-CN
    Slovak                  sk
    Slovenian               sl
    Spanish                 es
    Swahili                 sw
    Swedish                 sv
    Tamil                   ta
    Telugu                  te
    Thai                    th
    Chinese (Taiwan)        zh-TW
    Turkish                 tr
    Urdu                    ur
    Ukrainian               uk
    Vietnamese              vi
    Welsh                   cy
Podcast.__init__: Practical Advice On Using Python To Power A Business
Python is a language that is used in almost every imaginable context and by people from an amazing range of backgrounds. A lot of the people who use it wouldn't even call themselves programmers, because that is not the primary focus of their job. In this episode Chris Moffitt shares his experience writing Python as a business user. In order to share his insights and help others who have run up against the limits of Excel he maintains the site Practical Business Python where he publishes articles that help introduce newcomers to Python and explain how to perform tasks such as building reports, automating Excel files, and doing data analysis. This is a great conversation that illustrates how useful it is to learn Python even if you never intend to write software professionally.

Announcements

- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it, so take a look at our friends over at Linode.
With the launch of their managed Kubernetes platform it's easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show!
- We've all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more.
Go to pythonpodcast.com/census today to get a free 14-day trial.
- Your host as usual is Tobias Macey and today I'm interviewing Chris Moffitt about how Python is used to help manage business needs and processes and his work to share advice on this topic at Practical Business Python.

Interview

- Introductions
- How did you get introduced to Python?
- Can you start by giving an overview of your mission at Practical Business Python?
- What was your inspiration for starting the site and what keeps you motivated?
- What are some of the kinds of problems that a business user is looking to solve for themselves?
- Why is Python a viable tool for a business user to become familiar with?
- How would you characterize the difference between the ways that a software engineer and a business user approach Python?
- What do you see as the tipping point of complexity or time investment past which a business user will pass a given project on to a software engineer?
- How much familiarity with adjacent concerns such as version control, software design, etc. do you consider useful for a business user?
- What are some of the ways that you use Python in your day-to-day?
- What are some of the onramps for integrating Python into a user's workflow?
- What are some common stumbling blocks that business users run into when getting started with Python?
- What are some of the most interesting, innovative, or impressive ways that you have seen Python employed by business users?
- What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on the Practical Business Python site?
- What are some cases where you would advocate for a tool other than Python for a business use case?
- What do you have planned for the future of the site?

Keep In Touch

- LinkedIn
- chris1610 on GitHub
- @chris1610 on Twitter

Picks

- Tobias
  - The Data Science Roundup Newsletter
  - This Week In Data Newsletter
- Chris Moffitt
  - Line Of Duty BBC Series
  - Out Of The Dark by David Weber

Closing Announcements

- Thank you for listening!
Don't forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
- Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Links

- Practical Business Python blog
- Electrical Engineering
- Unix
- Perl
- Data Science
- Django
- Raspberry Pi
- Pandas
- Excel
- VBA == Visual Basic for Applications
- VSCode
- Excel PowerFX
- Pathlib
- Conda
- Python Wheels
- PEP 582
- SAP
- Salesforce
- Tableau
- Prophet library for timeseries forecasting
- Talk Python Course Moving From Excel To Python

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Codementor: Quick tip: How I use pip-tools to wrangle dependencies
pip-tools is a Python development tool that helps you ensure you have deterministic and predictable builds. The best way I can think of to explain what that means is by example. Let's say you clone a project...
After months of work, by many contributors, we're delighted to announce the release of Mu 1.1.0 beta 2. This is the version we recommend you use, and you should update to this version via our official installers that you can download from here.

The full list of updates can be found in our CHANGELOG. Of course, this being a beta release, we expect there to be bugs. Please don't hesitate to give feedback and report any problems you may find.

We'll be making further beta releases as this past year's efforts come together, and we continue to develop new features, fix bugs and enhance Mu. You are most welcome to contribute via our GitHub repository. All our previous releases are available via GitHub.

I want to make special mention and celebrate the work of:

Long time Mu contributor Tim Golden who (foolishly?) offered to help with Mu's handling of Python packages. His epic and Herculean efforts ensure Mu uses Pythonic virtual environments for package management. Thank you for this outstanding contribution Tim.

University lecturer Martin Dybdal has made numerous changes to Mu. The editor is vastly improved by Martin's fearless quest to improve all aspects of the code base. He has also been the driving force behind the new ESP mode that he uses to teach his students about ESP based MicroPython wrist-watches. Tak for al din hjælp Martin.

Carlos Pereira Atencio of the MicroBit Foundation has been a formidable and talented contributor to Mu from the very beginning. Without Carlos's exceptional work, touching all aspects of Mu, our efforts would be significantly impaired. As an embedded engineer his MicroPython related insights have been invaluable and it is Carlos who has ensured Mu works with the super-duper version 2 of the MicroBit. Muchas gracias Carlos.

The ever mysterious Zander Brown (Mu's answer to The Stig) continues to contribute help and support to our users via our online discussion forum.
As a student himself he knows how important it is to get the right sort of help, and Zander is ever patient, polite and positive in his mentorship role. Thank you Zander for making Mu's online presence such a positive and collaborative space.

Finally, the Pythonic force-of-nature that is Tiago Montes has, through his magnificent work on PUP, made packaging Mu a joy. As those who know about packaging software will tell you, this is an incredible achievement and means we can cut a newly packaged release within minutes, rather than hours (as was the case before). Tiago's infectious positivity, enthusiasm and humour have sustained many of us over the course of the recent efforts on Mu. Obrigado por um trabalho tão incrível Tiago. :-)

Finally, what happened to the beta 1 release..?

It was essentially a dress rehearsal and bug shake down that took place at the beginning of February. We learned a lot (and fixed many bugs) as a result of that beta, but it wasn't ever ready for public view.

Many thanks to those who continue to offer support to Mu. Many thanks to those who teach with Mu: you're helping the engineers and developers of tomorrow find their feet and flourish. Many thanks to those beginners who, despite how intimidating it must feel, engage with us and help us to improve Mu ~ you are especially welcome here.
Real Python: Python Community Interview With Ewa Jodlowska
Today I'm joined by Ewa Jodlowska, executive director of the Python Software Foundation (PSF), the organization devoted to advancing open source technology related to the Python programming language.

In this interview, we discuss how Ewa started her tech journey, how COVID-19 affected the PSF, plans for PyCon US 2021, her love of hiking and lifting weights, and much more.

Ricky: Thank you for joining me, Ewa. You've been at the PSF for over nine years at this point, first as the events coordinator, then as the director of operations, and now as the executive director. I'm curious to know a little about your background, how you found your way in the PSF, and why you're so passionate about Python.

Ewa: That is a great question and not one I get asked often!

I was first introduced to PyCon through my previous employer, where I was a meeting planner, account manager, and eventually a software engineer! We were contracted in 2008 to help out in several ways: We helped with logistical conference planning and eventually built a registration site for PyCon and managed hotel reservations. Back then, we were programming using PHP and Informix 4GL.

I implemented many first-time registration functionalities, such as how people signed up for tutorials! Of course, PyCon has its own system now, but the flow of it is still based on what was created for PyCon 2009.

Going to PyCon in 2008 and 2009 inspired me to get my CS degree through night school. Even though it didn't offer any Python courses, it helped uncover more of the tech scene for me.

In 2011 I left my previous employer to adventure around Europe for a couple of years, and the PSF offered me a part-time position to work on PyCon! By June of 2012 I was offered full-time employment since PyCon really took off when it was in Santa Clara, California. A couple months later, the PSF's part-time administrator left, and that responsibility was added to my role.
Through that role I received a lot of exposure to what the PSF did outside of PyCon and our wonderful community. From that time, the PSF really began to flourish, and the support we provided (and continue to provide) to the Python community continued to evolve.

Read the full article at https://realpython.com/interview-ewa-jodlowska/
Stack Abuse: How to Convert DOCX To Html With Python Mammoth
Introduction

At some point in your software development path, you'll have to convert files from one format to another.

DOCX (used by Microsoft Word) is a pretty common file format for a lot of people to use. And sometimes, we'd like to convert Word Documents into HTML.

This can easily be achieved via the Mammoth package. It's an easy, efficient, and fast library used to convert DOCX files to HTML. In this article, we'll learn how to use Mammoth in Python to convert DOCX to HTML.

Installing Mammoth

As a good practice, remember to have your virtual environment ready and activated before the installation:

    $ python3 -m venv myenv
    $ . myenv/bin/activate

Let's then install Mammoth with pip:

    $ pip3 install mammoth

This tutorial uses Mammoth version 1.4.15. Here's a sample document you can use throughout this tutorial. If you have a document to convert, make sure that it's a .docx file!

Now that you're ready to go, let's get started with extracting the text and writing that as HTML.

Extract the Raw Text of a DOCX File

Preserving the formatting while converting to HTML is one of the best features of Mammoth. However, if you just need the text of the DOCX file, you'll be pleasantly surprised at how few lines of code are needed.

You can use the extract_raw_text() method to retrieve it:

    import mammoth

    with open(input_filename, "rb") as docx_file:
        result = mammoth.extract_raw_text(docx_file)
        text = result.value  # The raw text
        with open('output.txt', 'w') as text_file:
            text_file.write(text)

Note that this method does not return a valid HTML document. It only returns the text on the page, hence why we save it with the .txt extension. If you do need to keep the layout and/or formatting, you'll want to extract the HTML contents.

Convert Docx to HTML with Custom Style Mapping

By default, Mammoth converts your document into HTML but it does not give you a valid HTML page.
While web browsers can display the content, it is missing an <html> tag to encapsulate the document, and a <body> tag to contain the document. How you choose to integrate its output is up to you. Let's say you're using a web framework that has templates. You'd likely define a template to display a Word Document and load Mammoth's output inside the template's body.

Mammoth is not only flexible with how you can use its output but how you can create it as well. Particularly, we have a lot of options when we want to style the HTML we produce. We map styles by matching each DOCX formatting rule to the equivalent (or as close as we can get) CSS rule.

To see what styles your DOCX file has, you have two options:

- You can open your docx file with MS Word and check the Styles toolbar.
- You can dig into the XML files by opening your DOCX file with an archive manager, and then navigate to the /word/styles.xml and locate your styles.

The second option can be used by those who don't have access to MS Word or an alternative word processor that can interpret and display the styles.

Mammoth already has some of the most common style maps covered by default. For instance, the Heading1 docx style is mapped to the <h1> HTML element, bold is mapped to the <strong> HTML element, etc.

We can also use Mammoth to customize the document's styles while mapping them.
For example, if you wanted to change all bold occurrences in the DOCX file to italic in the HTML, you can do this:

    import mammoth

    custom_styles = "b => i"

    with open(input_filename, "rb") as docx_file:
        result = mammoth.convert_to_html(docx_file, style_map=custom_styles)
        text = result.value
        with open('output.html', 'w') as html_file:
            html_file.write(text)

With the custom_styles variable, the style on the left is from the DOCX file, while the one on the right is the corresponding HTML element.

Let's say we wanted to omit the bold occurrences altogether. We can leave the mapping target blank:

    custom_styles = "b => "

Sometimes the document we're porting has many styles to retain. It quickly becomes impractical to create a variable for every style we want to map. Luckily we can use a triple-quoted, multi-line string to map as many styles as we want in one go:

    custom_styles = """ b => del
                        u => em
                        p[style-name='Heading 1'] => i"""

You may have noticed that the last mapping was a bit different from the others. When mapping styles, we can use square brackets [] with a condition inside them so that only a subset of elements are styled that way.

In our example, p[style-name='Heading 1'] selects paragraphs that have the style name Heading 1. We can also use p[style-name^='Heading'] to select each paragraph that has a style name starting with Heading.

Style mapping also allows us to map styles to custom CSS classes. By doing so, we can shape the style of HTML as we like.
Let's do an example where we define our basic custom CSS in a multi-line string like this:

    custom_css = """
        <style>
        .red{
            color: red;
        }
        .underline{
            text-decoration: underline;
        }
        ul li{
            list-style-type: circle;
        }
        table, th, td {
            border: 1px solid black;
        }
        </style>
        """

Now we can update our mapping to reference the CSS classes we've defined in the <style> block:

    custom_styles = """ b => b.red
                        u => em.red
                        p[style-name='Heading 1'] => h1.red.underline"""

Now all we need to do is merge the CSS and the HTML together:

    edited_html = custom_css + html

If your DOCX file has any of those elements, you will be able to see the results.

Now that we know how to map styles, let's use a more well-known CSS framework (along with the JS) to give our HTML a better look and practice a more likely real-life scenario.

Mapping Styles With Bootstrap (or Any Other UI Framework)

Just like we did with the custom_css, we need to ensure that the CSS is loaded with the HTML. We need to add the Bootstrap file URI or CDN to our HTML:

    bootstrap_css = '<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta2/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-BmbxuPwQa2lc/FVzBcNJ7UAyJxM6wuqIj61tLrc4wSX0szH/Ev+nYRRuWlolflfl" crossorigin="anonymous">'
    bootstrap_js = '<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta2/dist/js/bootstrap.bundle.min.js" integrity="sha384-b5kHyXgcpbZJO/tY9Ul7kGkf1S0CWuKcCD38l8YkeH8z8QjE0GmW1gYU5S9FOnJ0" crossorigin="anonymous"></script>'

We'll also slightly tweak our custom_styles to match our new CSS classes:

    custom_styles = """ b => b.mark
                        u => u.initialism
                        p[style-name='Heading 1'] => h1.card
                        table => table.table.table-hover
                        """

In the first line, we're mapping the bold DOCX style to the b HTML element with a class mark, which is a Bootstrap class equivalent of the HTML <mark> tag, used for highlighting part of the text.

In the second line, we're adding the initialism class to the u HTML element, slightly decreasing the font size and transforming the text to uppercase.

In the third line, we're selecting all paragraphs that have the style name Heading 1 and converting them to h1 HTML elements with the Bootstrap class of card, which sets multiple style properties such as background color, position, and border for the element.

In the last line, we're converting all tables in our docx file to the table HTML element, with Bootstrap's table class to give it a new look. We're also making rows highlight when hovered, by adding the Bootstrap class of table-hover.

Like before, we use dot-notation to map multiple classes to the same HTML element, even though the styles come from another source.

Finally, add the Bootstrap CDNs to our HTML:

    edited_html = bootstrap_css + html + bootstrap_js

Our HTML is now ready to be shared, with a polished look and feel! Here's the full code for reference:

    import mammoth

    input_filename = "file-sample_100kB.docx"

    custom_styles = """ b => b.mark
                        u => u.initialism
                        p[style-name='Heading 1'] => h1.card
                        table => table.table.table-hover
                        """

    bootstrap_css = '<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta2/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-BmbxuPwQa2lc/FVzBcNJ7UAyJxM6wuqIj61tLrc4wSX0szH/Ev+nYRRuWlolflfl" crossorigin="anonymous">'
    bootstrap_js = '<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta2/dist/js/bootstrap.bundle.min.js" integrity="sha384-b5kHyXgcpbZJO/tY9Ul7kGkf1S0CWuKcCD38l8YkeH8z8QjE0GmW1gYU5S9FOnJ0" crossorigin="anonymous"></script>'

    with open(input_filename, "rb") as docx_file:
        result = mammoth.convert_to_html(docx_file, style_map=custom_styles)
        html = result.value

    edited_html = bootstrap_css + html + bootstrap_js

    output_filename = "output.html"
    with open(output_filename, "w") as f:
        # edited_html is a single string, so write() is the right call here
        f.write(edited_html)

Another point to note here is that in a real-life scenario, you probably will not add Bootstrap CSS directly to the HTML content as we did here.
Instead, you would load/inject the HTML content into a prepacked HTML page, which would already have the necessary CSS and JS bundles.

So far you've seen how much flexibility we have to style our output. Mammoth also allows us to modify the content we're converting. Let's take a look at that now.

Dealing With Images We Don't Want Shared

Let's say we'd like to omit the images in our DOCX file from being converted. The convert_to_html() function accepts a convert_image argument, which is an image handler function. For each image, the handler returns a list of the elements that should be added to the HTML document.

Naturally, if we override it and return an empty list, the images will be omitted from the converted page:

    def ignore_image(image):
        return []

Now, let's pass that function as a parameter into the convert_to_html() method:

    with open(input_filename, "rb") as docx_file:
        result = mammoth.convert_to_html(docx_file, style_map=custom_styles, convert_image=ignore_image)
        html = result.value
        with open('output.html', 'w') as html_file:
            html_file.write(html)

That's it! Mammoth will ignore all the images when generating an HTML file.

We've been programmatically using Mammoth with Python so far. Mammoth is also a CLI tool, so we have another interface for doing DOCX to HTML conversions. Let's see how that works in the next section.

Convert DOCX to HTML Using Command Line Tool

File conversion with Mammoth, using the CLI, typically looks like this:

    $ mammoth path/to/input_filename.docx path/to/output.html

If you wanted to separate the images from the HTML, you can specify an output folder:

    $ mammoth file-sample_100kB.docx --output-dir=imgs

We can also add custom styles as we did in Python.
You first need to create a custom style file:

    $ touch my-custom-styles

Then we'll add our custom styles to it. The syntax is the same as before:

    b => b.red
    u => em.red
    p[style-name='Heading 1'] => h1.red.underline

Now we can generate our HTML file with the custom styles:

    $ mammoth file-sample_100kB.docx output.html --style-map=my-custom-styles

And you're done! Your document will have been converted with the defined custom styles.

Conclusion

File conversion is a common task when working with web technologies. Converting DOCX files into well-known and easy to manipulate HTML allows us to reconstruct the data as much as we need. With Mammoth, we've learned how to extract the text from a docx and how to convert it to HTML.

When converting to HTML we can style the output with CSS rules we create or ones that come with common UI frameworks. We can also omit data we don't want to be available in the HTML. Lastly, we've seen how to use the Mammoth CLI as an alternative option for file conversion.

You can find a sample docx file along with the full code of the tutorial on this GitHub repository.
Learn PyQt: PySide6 Book now available: Create GUI Applications with Python & Qt6 The hands-on guide to making apps with Python
Hello! This morning I released the first Qt6 edition of my PySide book Create GUI Applications, with Python & Qt6.

This update follows the 4th Edition of the PySide2 book, updating all the code examples and adding additional PySide6-specific detail.

The book contains 600+ pages and 200+ complete code examples taking you from the basics of creating PySide applications to fully functional apps.

To celebrate the milestone, the book is available this week with 20% off. As with earlier editions, readers get access to all future updates for free - so it's a great time to snap it up! You'll also get a copy of the PyQt5 and PySide2 editions.

If you previously bought a copy of my Qt5 books (for PyQt5 or PySide2) you get this update for free! Just log into your account on LearnPyQt and you'll find the book already waiting for you under "My Books & Downloads".

The PyQt6 edition will be released shortly. If you have any questions or difficulty getting hold of this update, just get in touch.

Enjoy!
Postgres For Those Who Can't Even, Part 3 - In The Real World
This is part 3 of a series of posts I'm writing for Friendo, a web person who wants to get their hands a lot dirtier with Node and Postgres. You can read part 1 here, and part 2 here, where we say hello to Postgres and learn how to use it with Node.

In this post we'll depart from the fun happy demo world where everything just works (super cool, isn't that awesome?) and into the weeds, trying to build an application foundation that won't utterly suck in 6 months. If you want to look over/play with the code it's right here, although it's still super preview.

The short summary here is that I wanted to build something reasonably real and sort of went off and built something kinda real that I would use in production today. It was fun to put together and I thought I would share it with you.

Lots to get into, let's go.

SQL: Leave It To The Pros

People don't like seeing SQL in their code for a variety of mostly silly reasons:

- SQL doesn't scale
- SQL === Injection Attacks
- Junior devs won't understand it
- It's ugly and hard to work with

Injection attacks are definitely something you'll want to worry about no matter what tool you use. Also: if you don't know SQL then yes, it can appear verbose and daunting. To me, however, there's a major reason you should use a simple abstraction: future you will be confused.

It can be really difficult to transition your brain from reading application code and tests to SQL, trying to reconcile what a query might do or return. Code is indeed a bit more expressive in this way, and it's easier to refactor.

Let's Use MassiveJS

My favorite DB tool is one that I created many years ago and has since been taken over by the ultra freaky amazing Dian Fay - MassiveJS. There's a lot to explain about how it works, so I'll just cut to the reasons I like it:

- It's a dedicated Postgres tool that flexes the amazing features of Postgres and
- It has built-in support for JSONB.
We like this.

The idea here is that we can start using JSONB, saving our models as documents and then, if we want, we can move to using a relational structure with very few code changes. Best of both worlds!

Booting MassiveJS In An Express App

Finally, some code! When MassiveJS boots up it scans your database and reads in your table information - column names, keys, etc. This is an asynchronous process, which means we need to change the way our app is booting.

I like to have everything in an app.js file so it's right in front of me, and I set everything inside of an async start function:

    var createError = require('http-errors');
    var express = require('express');
    var path = require('path');
    var cookieParser = require('cookie-parser');
    var logger = require('morgan');
    var session = require("express-session");
    var http = require('http');
    const bodyParser = require("body-parser");
    const Massive = require("massive");

    const start = async function(){
      var app = express();
      const db = await Massive(process.env.DATABASE_URL);
      app.set("db", db);
      //...
      //return the app so the caller below can read its settings
      return app;
    }

    //start is a function, so it has to be called to get the promise
    start().then(app => {
      const port = app.get("port");
      console.log(`App is running on port ${port}`);
    });

This allows us to await any boot up stuff that is asynchronous - like Massive.

By default, the server boot stuff in Express is in bin/www but I like the app bits right in front of me so I know what's happening.

The App Stuff, however, goes in its own boot file.

The Boot File

A practice in Sinatra land (Ruby) is to have a config/boot file where everything is loaded up. This can be a single file or, if you're kicking up a lot of stuff, multiple files.

I decided to do a single boot file config/boot.js and in there I do app specific initialization.
Not the plumbing of Express or Express middleware - the things I care about with the app itself:

    const Auth = require("../lib/auth");
    const Mail = require("../mail");
    const Passport = require("./passport");
    const Massive = require("massive");
    const consola = require("consola");
    const settings = require("../package.json");
    require("dotenv").config();

    exports.theApp = async function(app){
      let rootUrl = `http://localhost:${app.get("port")}`;
      if(process.env.NODE_ENV === "production" && settings.azure){
        rootUrl = settings.azure.siteUrl;
      }
      //set the root URL for use throughout the app
      app.set("rootUrl", rootUrl);

      consola.info(`Connecting to ${process.env.DATABASE_URL}`)
      //spin up massive... yay!
      const db = await Massive(process.env.DATABASE_URL);
      app.set("db", db);

      consola.info("Initializing Auth service...");
      Auth.init({db:db});

      consola.info("Initializing Passport service...");
      let passport = Passport.init({
        Auth: Auth,
        GoogleSettings: {
          clientID: process.env.GOOGLE_ID,
          clientSecret: process.env.GOOGLE_SECRET,
          callbackUrl: `${rootUrl}/auth/google/callback`
        },
        GithubSettings: {
          clientID: process.env.GITHUB_ID,
          clientSecret: process.env.GITHUB_SECRET,
          callbackUrl: `${rootUrl}/auth/github/callback`,
          scope: ["user:email"]
        }
      });

      consola.info("Initializing email...")
      Mail.init({
        host: process.env.SMTP_HOST,
        user: process.env.SMTP_USER,
        password: process.env.SMTP_PASSWORD
      })

      app.use(passport.initialize());
      app.use(passport.session());
    }

There are a number of things going on here - enough so that I will definitely need a Part 4 (so I can explain what settings.azure is). Hopefully this looks like what it is: where everything is configured.
Some people put configuration stuff inside of a module - that's OK, but I think it's much easier to put it in one place where you have access to your app, your db and other services.

If you squint your eyes it's almost like poor-person's IoC, where you initialize your stuff in a central spot using environment variables instead of some wonky XML configuration or other nonsense.

So there's a lot going on here; for now let's focus on Auth. Once Massive is instantiated, you'll notice that I'm passing that instance to an Auth module. This is because Node's modules are singletons by default, and that's exactly what we want because Massive creates a connection pool to Postgres. If we accidentally create multiple instances of Massive, we'll have multiple pools, which can cause our app to crash as well as make Massive complain constantly.

I know there will always be only one instance because I can see it right here in my config. If other classes or services need it, I'll pass it on at boot. I'll get more into this below.

Authentication, Built In

The Tailwind Starter CSS Kit from Creative Tim

I wanted a specific use case for working with PostgreSQL on Azure and that grew into something I've wanted for a really long time: a super simple starter site without all the cruft and noise. I'm a little bit opinionated on things and I like it when an application design is as simple and straightforward as possible with room to grow.

As you can see above, I'm using PassportJS to handle OAuth. There's enough going on with Passport that I decided to give it its own boot file, which you can see in the repo.

Here's where we get to the meat of the matter, however: where does the auth logic live? The simplest approach is to just drop it on a User model:

    class User{
      constructor(args){
        //init stuff
      }
      register({name, email, password}){
        //...
      }
      login({email, password}){
        //...
      }
    }

This seems like a straightforward thing but it's not, because I need to work with the database at some point, which means I either need to require a database instance somewhere or pass it in through the constructor.

That becomes a mess, fast. I could try to do an ActiveRecord type of base class, which is where I was headed before because it's simple - but (in my experience) this is the EXACT kind of emerging technical debt that I don't want to deal with. This is my opinion based on my experience, but orchestrating logic with data-aware models leads to a giant mess.

Probably because I'm not disciplined enough to find rugs to shove my code mess under - but I'd rather do something a bit cleaner.

The Auth Service

The next logical step is to have a service class that deals solely with "Auth stuff", like registering, logging in, changing passwords, etc. You could spread this around multiple files or, to start, just a single file as I have.

This is where I constantly find myself when working with Node: do I keep this module a singleton or export a class? I almost always go with the class option so I can pass in whatever config stuff I need to:

    const assert = require("assert");

    class Auth{
      constructor(args){
        //set things
      }
    }
    exports.init = function({db}){
      assert(db, "Need a db instance here");
      return new Auth({db:db});
    }

This is a typical factory pattern where you don't allow direct access to a class's constructor and, instead, use a method that describes what you're trying to do.

This works, but it has a drawback: you can't get to the Auth instance unless you call init(). That's a pain! I need this service in at least three spots:

- The boot file
- The auth route file
- The passport config file

Ideally, I could just require the Auth service and it shows up and works! This is where Node's singleton thing comes in ultra handy. Instead of exporting a class instance, I'll just set some variables on the singleton:

    const assert = require("assert");

    //module-level state, shared by everything that requires this file
    let db = null
    exports.authenticate = function({email, password}){
      //... use the db in here
    }
    exports.register = function({name, email, password}){
      //... use the db in here
    }
    exports.init = function(args){
      assert(args.db, "Need a db instance here");
      db = args.db
    }

This works great but only if you're making sure the boot file is called before any app code. But you would do that anyway, wouldn't you?

Here's what it looks like in my editor:

Notice that last line there circled in red? That's Massive doing its thing with a document: db.users.saveDoc(user). That's one of the main reasons I love working with Massive - you can start out saving your models as documents and, later on, if you want to get relational about it, go ahead! I really dislike migrations, but I LOVE Postgres for its data rules and speed. The best of both worlds.

Next Time: Deployment

One of the things I also added to this starter site is an in-house deployment setup using Azure. I love the way Heroku works - cuddling up to your app and helping you seamlessly deploy it - so I added a version of that experience to this starter app.

I added Commander from TJ because I think every app - even a web app - should have a CLI. I added a bunch of commands and a few other things to streamline the deployment experience and it all worked pretty well!

You start with some Q&A, asking about where, what size web and DB servers, and your configuration is set for you in your package.json:

There are no passwords in here - they're stored (for now) in a local DB in the root of the CLI which DOES NOT get committed. You can view everything, if you want, using a simple CLI command:

All of your deployment users and passwords are generated for you, as well as your app's name, app service plan name and so on. Again - I'll blog more about this in the next post but the idea here is to make this as seamless and simple as possible.

And guess what! It works pretty well! As you can see in the output, a remote git repo is set up for you locally so you can push/deploy using Git.
A resource group, service plan, app settings, database along with database settings and even initialization with a users table - it's all done and ready.

Within 8 minutes you can browse your site or, if you want, explore your production database using psql locally.

I threw in a bunch more stuff too - things that I've always wanted at the ready when working with a cloud provider (like logging, open my site for me please, back up my db here please, etc). I'll share more in the next post.

Summary

This post had less to do with Postgres I suppose - more with how you work with it.
Postgres For Those Who Can't Even, Part 2 - Working with Node and JSON
This is part 2 of a series of posts I'm doing for a friend who's a JavaScript developer that, according to him, knows next to nothing about Postgres. You can read part 1 right here.

I write a lot about Postgres, but I don't think I've written enough about how to get started from the absolute beginning, so that's what we're doing here.

In this post, I'm continuing with his questions to me - but this time it has less to do with the database side of things and more to do with Node and how you can use Postgres for fun and profit. Let's roll.

How should I structure my code?

This question has more to do with your preferences or what your company/boss have set up. I can show you how I do things, but your situation is probably a lot different.

OK, enough prevaricating. Here's what I've done in the past with super simple projects where I'm just musing around.

Give PG Its Own Module

I like putting all my code inside of a lib directory, and then inside there I'll create a pg directory with specific connection things etc for Postgres. It looks like this:

You'll also notice I have a .env file, which is something that goes into every single project of mine. It's a file that holds environmental variables that I'll be using in my project.
In this case, I do not want my connection string hardcoded anywhere - so I pop it into a .env file where it's loaded automatically by my shell (zshell and, for those interested, I use the dotenv plugin with Oh-My-Zsh).

There's a single file inside of the lib/pg directory called runner.js, and it has one job: run the raw SQL queries using pg-promise:

    const pgp = require('pg-promise')({});
    const db = pgp(process.env.DATABASE_URL);

    exports.query = async function(sql, args){
      const res = await db.any(sql, args);
      return res;
    }
    exports.one = async function(sql, args){
      const res = await db.oneOrNone(sql, args);
      return res;
    }
    exports.execute = async function(sql, args){
      const res = await db.none(sql, args);
      return res;
    }
    exports.close = async function(){
      await db.$pool.end();
      return true;
    }

I usually have 3 flavors of query runners:

- One that will return 0 to n records
- One that will return a single record
- One that executes a "passthrough" query that doesn't return a result

I also like to have one that closes the connections down. Normally you wouldn't call this in your code because the driver (which is pg-promise in this case) manages this for you and you want to be sure you draw on its pool of connections - don't spin your own. That said, sometimes you might want to run a script or two, maybe some integration tests might hit the DB - either way a graceful shutdown is nice to have.

We can use this code in the rest of our app:

    const pg = require("./lib/pg/runner");

    pg.query("select * from master_plan limit 10")
      .then(console.log)
      .catch(console.error)
      .finally(pg.close)

Neat! It works well but yes, we'll end up with SQL all over our code so let's fix that.

A Little Bit of Abstraction

The nice thing about Node is that your modules can be single files, or you can expand them to be quite complex - without breaking the code that depends on them. I don't want my app code to think about the SQL that needs to be written - I'd rather just offer a method that gives the data I want.
In that case, I'll create an index.js file for my pg module, which returns a single method for my query called masterPlan:

    const runner = require("./runner");

    exports.masterPlan = function(limit=10){
      //pass the limit as a query parameter instead of interpolating it
      return runner.query("select * from master_plan limit $1", [limit]);
    }
    exports.shutDown = function(){
      //return the promise so callers can wait for the pool to close
      return runner.close();
    }

The runner here is the same runner that I used before; this time it's in the same directory as the calling code. I've exposed two methods on the index as that's all I need for right now. This is kind of like a Repository Pattern, which comes with a few warnings attached.

People have been arguing about data access for decades. What patterns to use, how those patterns fit into the larger app you're building, etc, etc, etc. It's really annoying.

Applications always start small and then grow. That's where the issues come in. The Repository Pattern looks nice and seems wonderful until you find yourself writing Orders.getByCustomer and Customer.getOrders, wondering if this is really what you wanted to do with your life.

This is a rabbit hole I don't want to go down further so, I'll kindly suggest that if you have a simple app with 10-20 total queries, this level of control and simplicity of approach might work really well. If your app will grow (which I'm sure it will whether you think so or not), it's probably a good idea to use some kind of library or relational mapper (ORM), which I'll get to in just a minute.

How do I put JSON in it?

One of the fun things about Node is that you can work with JSON everywhere. It's fun, I think, to not worry about data types, migrations, and relational theory when you're trying to get your app off the ground.

The neat thing about Postgres is that it supports this and it's blazing fast. Let's see how you can set this up with Postgres.

Saving a JSONB Document

Postgres has native support for binary JSON using a datatype called "JSONB". It behaves just like JSON but you can't have duplicate keys.
It's also super fast because you can index it in a variety of ways.

Since we're going to store our data in a JSONB field, we can create a "meta" table in Postgres that will hold that data. All we need is a primary key, a timestamp and the field to hold the JSON:

    create table my_document_table(
      id serial primary key,
      doc jsonb not null,
      created_at timestamp not null default now()
    );

We can now save data to it using a query like this:

    insert into my_document_table(doc)
    values('{"name":"Burke Holland"}');

And yuck. Why would anyone want to do something like this? Writing delimited JSON by hand is gross, let's be good programmers and wrap this in a function:

    const runner = require("./runner");

    //in pg/index.js
    exports.saveDocument = async function(doc){
      const sql = "insert into my_document_table (doc) values ($1)";
      const res = await runner.one(sql, [doc]);
      return res;
    }

This works really well, primarily because our Node driver (pg-promise) understands how to translate JavaScript objects into something Postgres can deal with. We just pass that in as an argument.

But we can do better than this, don't you think?

Sprinkling Some Magical Abstraction

One of the cool things about using a NoSQL system is that you can create a document table on the fly. We can do that easily with Postgres but we just need to tweak our saveDocument function a bit. In fact we need to tweak a lot of things.

Let's be good programmers and create a brand new file called jsonb.js inside our pg directory, right next to our runner.js file.
The first thing we'll do is to create a way to save any document and, if we get an error about a table not existing, we'll create it on the fly!

    const runner = require("./runner");

    exports.save = async function(tableName, doc){
      const sql = `insert into ${tableName} (doc) values ($1) returning *`;
      try{
        const newDoc = await runner.one(sql, [doc]);
        doc.id = newDoc.id;
        return doc;
      }catch(err){
        if(err.message.indexOf("does not exist") > 0){
          //create the table on the fly
          await this.createDocTable(tableName);
          return this.save(tableName,doc);
        }
        //anything else is a real error, so don't swallow it
        throw err;
      }
    }

    exports.createDocTable = async function(tableName){
      await runner.query(`
        create table ${tableName}(
          id serial primary key,
          doc jsonb not null,
          created_at timestamp not null default now()
        )`);
      await runner.query(`
        create index idx_json_${tableName}
        on ${tableName}
        USING GIN (doc jsonb_path_ops)
      `);
    }

We have two groovy functions that we can use to save a document to Postgres with the sweetness of a typical NoSQL, friction-free experience. A few things to note about this code:

- We're catching a specific error when a table doesn't exist in the database. There's probably a better way to do that, so feel free to play around. If there's an error, we're creating the table and then calling the save function one more time.
- The createDocTable function also pops an index on the table which uses jsonb_path_ops. That argument tells Postgres to index every key in the document. This might not be what you want, but indexing is a good thing for smaller documents.
- We're using a fun clause at the end of our insert SQL statement, specifically returning *, which will return the entire, newly-created record, which we can then pass on to our calling code.

Let's see if it works!

    //index.js of our project
    docs.save("customers", {name: "Mavis", email: "mavis@test.com"})
      .then(console.log)
      .catch(console.error)
      .finally(pg.shutDown);

Well look at that would ya! It works a treat.

But what about updates and deletes?
Deleting a document is a simple SQL statement:

    exports.delete = async function(tableName, id) {
      const sql = `delete from ${tableName} where id=$1`;
      await runner.execute(sql, [id]);
      return true;
    };

You can decide what to return from here if you want, I'm just returning true. Updating is a different matter, however.

Updating an existing JSONB document

One of the problems with JSONB and Postgres in the past (< 9.5) was that in order to update a document you had to wholesale update it - a partial update wasn't possible. With Postgres 9.5 that changed with the jsonb_set function, which takes the path to a key and a JSONB value.

So, if we wanted to change Mavis's email address, we could use this SQL statement:

    update customers
    set doc = jsonb_set(doc, '{"email"}', '"mavis@example.com"')
    where id = 1;

That syntax is weird, don't you think? I do. It's just not very intuitive, as you need to pass an array literal to define the key and a JSON-quoted string as the new value.

To me it's simpler to just concatenate a new value and do a wholesale save. It's nice to know that a partial update is possible if you need it, but overall I've never had a problem just running a complete update like this:

    exports.modify = async function(tableName, id = 0, update = {}) {
      if (!tableName) return;
      const sql = `update ${tableName} SET doc = (doc || $1) where id = $2 returning *;`;
      const res = await runner.one(sql, [update, id]);
      return res;
    };

The || operator that you see there is the JSONB concatenation operator, which will update an existing key in a document or add one if it's not there. Give it a shot! See if it updates as you expect.

Querying a JSONB document by ID

This is the nice thing about using a relational system like Postgres: querying by id is just a simple SQL statement.
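Before we write that query, one detail about the || operator used in modify is worth seeing in miniature: it only merges the top level of two JSONB objects. Nested objects are replaced wholesale, not merged. The sketch below imitates that with plain JavaScript objects so you can see the effect without a database. The concatLikeJsonb name and the address field are invented for the illustration.

```javascript
// Plain-object stand-in for `doc || update` on two JSONB objects:
// top-level keys from the update win, nested values are replaced whole.
function concatLikeJsonb(doc, update) {
  return { ...doc, ...update };
}

// Made-up document with a nested object.
const mavis = {
  name: "Mavis",
  email: "mavis@test.com",
  address: { city: "Oslo", zip: "0150" }
};

const changed = concatLikeJsonb(mavis, {
  email: "mavis@example.com",
  address: { city: "Bergen" }
});

// The email is updated, the name survives, but address.zip is gone
// because the whole address object was replaced.
console.log(changed);
```

If you need to change one nested key and keep its siblings, that's the case where jsonb_set earns its weird syntax.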
Let's create a new function for our jsonb module called get, which will return a document by ID:

    exports.get = async function(tableName, id=0){
      const sql = `select * from ${tableName} where id=$1`;
      const record = await runner.one(sql, [id]);
      const doc = record.doc;
      doc.id = record.id;
      return doc;
    }

Simple enough! You'll notice that I'm adding the id of the row in Postgres to the document itself. I could store that in the document itself, if I wanted, but it's simple enough to tack it on as you see. In fact, I think I'd like to ensure the created_at timestamp is on there too, so let's formalize this with some transformations:

    const transformRecord = function(record){
      if(record){
        const doc = record.doc;
        doc.createdAt = record.created_at;
        doc.id = record.id;
        return doc;
      }else{
        return null;
      }
    }

    const transformSet = function(res){
      //comparing against [] with === is always false, so check the length
      if(!res || res.length === 0) return res;
      const out = [];
      for(let record of res){
        const doc = transformRecord(record);
        out.push(doc)
      }
      return out;
    }

This will take the raw record from Postgres and turn it into something a bit more usable.

Querying a document using criteria

We can pull data out of our database using an id, but we need another way to query if we're going to use this properly.

You can query documents in Postgres using a special operator: @>. There are other operators, but this is the one we'll need for 1) querying specific keys and 2) making sure we use an index. There are all kinds of operators and functions for JSONB within Postgres and you can read more about them here.

To query a document for a given key, you can do something like this:

    select * from customers
    where doc @> '{"name":"Burke Holland"}';

This query simply asks for documents where the key/value {name:"Burke Holland"} exists.
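If "contains" feels fuzzy, here's a rough JavaScript approximation of what @> checks, written against plain objects. It's a sketch for building intuition, not a reimplementation: the function name contains is my own, and it skips at least one Postgres special case (an array containing a bare primitive).

```javascript
// Rough approximation of JSONB containment: does `doc` contain `criteria`?
function contains(doc, criteria) {
  if (Array.isArray(criteria)) {
    // Every element asked for must be contained in some element of the doc array.
    return Array.isArray(doc) &&
      criteria.every(c => doc.some(d => contains(d, c)));
  }
  if (criteria !== null && typeof criteria === "object") {
    if (doc === null || typeof doc !== "object" || Array.isArray(doc)) {
      return false;
    }
    // Every key asked for must exist, and its value must be contained too.
    return Object.keys(criteria).every(key =>
      Object.prototype.hasOwnProperty.call(doc, key) &&
      contains(doc[key], criteria[key])
    );
  }
  // Strings, numbers, booleans and null must match exactly.
  return doc === criteria;
}

const burke = { name: "Burke Holland", email: "burke@example.com" };
console.log(contains(burke, { name: "Burke Holland" })); // true
console.log(contains(burke, { name: "Mavis" }));         // false
```

The takeaway is that the criteria can be any partial shape of the document, extra keys in the document are ignored, and values are compared exactly (no fuzzy matching).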
That criteria is simply JSON, which means we can pass that right through to our driver and behold:

    exports.find = async function(tableName, criteria){
      const sql = `select * from ${tableName} where doc @> $1`;
      const record = await runner.query(sql, [criteria]);
      return transformSet(record);
    }

Let's run this and see if it works:

    docs.find("customers", {email: "mavis@test.com"})
      .then(console.log)
      .catch(console.error)
      .finally(pg.shutDown);

Hey, that's pretty swell! You don't need to use dedicated JSON operators to query a JSONB document in Postgres. If you're comfortable with SQL, you can just execute a regular old query and it works just fine:

    select * from customers
    where (doc ->> 'name') ilike 'Mav%'

Here, we're pulling the name key from the document using the JSON text selector (->>), and then doing a fuzzy comparison using ilike (case-insensitive comparison). This works pretty well, but it can't use the index we set up and that might make your DBA mad.

That doesn't mean you can't index it - you can!

    create index idx_customer_name
    on customers((doc ->> 'name'));

Works just like any other index!

Play around, have some fun

I made a gist out of all of this if you want to goof around. There are things to add, like updates/partial updates, and I encourage you to play and have a good time.

If you're wondering, however, if someone, somewhere, might have baked this stuff into a toolset... indeed! They did.

Are there any ORM-like tools in it? What do you recommend?

So here's the thing: if you're coming to this post from a Java/C#/Enterprise-y background, the ORM tools in the Node world are going to look, well, a bit different. I don't know the reason why and I could pontificate about Node in the enterprise or how Node's moduling system pushes the idea of isolation, but let's just skip all of that, OK?

The bottom line is this: you can do data access with Node, but if you're looking for an industrial strength thing to rival Entity Framework you might be disappointed.
With that said - let's have a look.

My favorite: MassiveJS

I am 100% completely biased when it comes to MassiveJS because, well, I created it along with my friend Karl Seguin back in 2011 or so. The idea was to build a simple data access tool that would help you avoid writing too much SQL. It morphed into something much, much more fun.

With version 2 I devoted Massive to Postgres completely and was joined by the current owner of the project, Dian Fay. I can't say enough good things about Dian - she's amazing at every level and has turned this little project into something quite rad. Devoting Massive 100% to Postgres freed us up to do all kinds of cool things - including one of the things I love most: document storage.

The code you read above was inspired by the work we did with JSONB and Massive. You can have a fully-functioning document storage solution that kicks MongoDB in the face in terms of speed, fuzzy searches, full-text indexing, ACID guarantees and a whole lot more. Massive gives you the same, simple document API and frictionless experience you get with Mongo with a much better database engine underneath.

To work with Massive, you create an instance of your database which reads in all of your tables and then allows you to query them as if they were properties (the examples below are taken from the documentation):

    const massive = require('massive');
    const db = await massive({
      host: 'localhost',
      port: 5432,
      database: 'appdb',
      user: 'appuser',
      password: 'apppwd',
      ssl: false,
      poolSize: 10
    });

    //save will update or insert based on the presence of an
    //ID field
    let test = await db.tests.save({ version: 1, name: 'homepage' });

    // retrieve active tests 21-30
    const tests = await db.tests.find({is_active: true}, { offset: 20, limit: 10 });

Working with documents looks much the same as the relational stuff above, but it's stored as JSON:

    const report = await db.saveDoc('reports', {
      title: 'Week 12 Throughput',
      lines: [
        { name: '1 East', numbers: [5, 4, 6, 6, 4] },
        { name: '2 East', numbers: [4, 4, 4, 3, 7] }
      ]
    });

Finally, the thing I love most about the project is what Dian has done with the documentation (linked above). She goes into detail about every aspect of the tool - even how to use it with popular web frameworks.

Sequelize

One of the more popular data access tools - let's call it a full-on ORM - is Sequelize. This tool is a traditional ORM in every sense in that it allows you to create classes and save them to multiple different storage engines, including Postgres, MySQL/MariaDB, SQLite and SQL Server. It's kind of not an ORM though, because there is no mapping (the "M") that you can do aside from a direct 1:1, ActiveRecord style. For that, you can project what you need using map and I'll just leave that discussion right there.

If you've used ActiveRecord (Rails or the pattern itself) before then you'll probably feel really comfortable with Sequelize. I used it once on a project and found its use straightforward and simple to understand. Getting started was also straightforward, as with any ORM, and the only question is how well an ActiveRecord pattern fits your project's needs both now and into the future. That's for you to decide and this is where I hit the architectural eject button again (even though I did once before, which didn't seem to work).

Let's have a look at some of the examples that come from the documentation.

Connecting is straightforward:

    const Sequelize = require('sequelize');
    const sequelize = new Sequelize('postgres://user:pass@example.com:5432/dbname');

Declaring a model in Sequelize is a matter of creating a class and extending from Sequelize.Model or using a built-in definition method. I prefer the latter:

    const User = sequelize.define('user', {
      // attributes
      firstName: { type: Sequelize.STRING, allowNull: false },
      lastName: {
        type: Sequelize.STRING
        // allowNull defaults to true
      }
    }, {
      // options
    });

Sequelize is capable of using this model definition and generating, or synchronizing, your database just like Django's ORM does.
That's really helpful in the early days of your project or if you just hate migrations as much as I do.

Sequelize is an outstanding data tool that allows you to work with your database in a seamless way. It has powerful queries and can handle some pretty intense filtering:

    Project.findOne({
      where: {
        name: 'a project',
        [Op.not]: [
          { id: [1,2,3] },
          { array: { [Op.contains]: [3,4,5] } }
        ]
      }
    });

If you've worked with Rails and ActiveRecord, Sequelize should feel familiar when it comes to associations, hooks and scopes:

    class User extends Model { }
    User.init({
      name: Sequelize.STRING,
      email: Sequelize.STRING
    }, {
      hooks: {
        beforeValidate: (user, options) => { user.mood = 'happy'; },
        afterValidate: (user, options) => { user.username = 'Toni'; }
      },
      sequelize,
      modelName: 'user'
    });

    class Project extends Model { }
    Project.init({name: Sequelize.STRING}, {
      scopes: {
        deleted: { where: { deleted: true } }
      },
      //sequelize and modelName are init options, not scopes
      sequelize,
      modelName: 'project'
    });

    User.hasOne(Project);

And there you have it. The documentation for Sequelize is very complete as well, with examples and SQL translations so you know what query will be produced for every call.

But what about ...?

There are so many tools out there that can help you with Node and data access and I'm sure I've left a few off, so feel free to add your favorite in the comments. Please be sure it works with Postgres AND please be sure to indicate why you like it!

Postgres is neat and all but how do I deploy my database?

Great question! That will have to be a topic for Part 3, unfortunately, as this post is quite long and I have a lot of ideas. We'll go simple and low fidelity with a simple docker container push, and then look at some of the hosted, industrial strength solutions out there - including Azure's Managed Postgres offering!
Just yesterday I was talking to a friend about Postgres (not uncommon) and he said something that I found shocking:

"I can't even with Postgres, I know JACK SQUAT"

This person calls himself my friend too! I just don't even know what's real anymore.

So, Friendo is a Node person who enjoys using a document database. Can't blame him - it's easy to set up, easy to run and you don't need to stress out about SQL and relational theory. That said, there are benefits to wrapping structure and rules around your data - it is the lifeblood of your business after all.

If you're like Friendo and you want to start from the very beginning with Postgres, read on! I'll use his questions to me for the rest of this post. He has a lot of questions, so I'm going to break this up into parts:

- Part 1 (this post) is for people who've never thought about a database before, let alone set one up and run a query
- Part 2 (next post) will be for Node people wondering what/why/how they could work with Postgres

I encourage you to play along if you're curious. If you're having fun and want to do more, I wrote a really fun book about Postgres and the data from the Cassini mission (which you'll see below) that you're welcome to check out too!

Where is Postgres? How do I get it and run it?

The easiest possible thing you can do is to run a docker image, which you can do by executing:

    docker run -p 5432:5432 postgres:12.1

That will download and run a Postgres image, exposing the default Postgres port of 5432.

If you're not a Docker person and are on a Mac, you can also head over to postgresapp.com where you can download a free executable app.

How do I manage it with a tool?

Tooling for Postgres is both abundant and wanting. There is no clear cut answer to this question other than to offer the following options for a given context.

Just playing around: Mac

If you're on a Mac go get yourself a free copy of Postico.
It's easy and you can quickly connect and start playing.

Just playing around: Windows (and Mac)

There's the free Azure Data Studio, which uses the same interface as VS Code. There are extensions and all kinds of goodies you can download if you want as well.

To hook up to Postgres, make sure you grab the Postgres extension. You can install it right from the IDE by clicking on the square thingies in the bottom left of the left-most pane.

Something substantial and you're willing to pay for it (Windows and Mac)

My go-to tool for working with Postgres is Navicat. It's a bit on the spendy side but you can do all kinds of cool things, including reports, charting, import/export, data modeling and more. I love this thing.

Don't know what to choose? Just download Azure Data Studio and let's get to work!

Our first login

Let's connect to our new shiny Postgres server. Open up Azure Data Studio and make sure you have the Postgres extension installed. You'll know if you do because you'll see the option to connect to PostgreSQL in the connection dialog.

The server name is localhost and the Docker image comes with the login preset - postgres as the user name and postgres as the password.

We'll go with the default database and, finally, name our connection "Local Docker". Click "Connect" and you're good to go.

Our first database

Most GUI tools have some way of creating a database right through the UI. Azure Data Studio doesn't (for Postgres at least) but that's OK, we'll create one for ourselves.

If you've connected already, you might be wondering "what, exactly, am I connected to?" Good question Friendo! You're connected to the default database, postgres. This is the admin playground, where you can do DBA stuff and feel rad. We're going to use our connection to this database to create another one, where we're going to drop some data. To do that, we need to write a new query.
Click that button that says "New Query". In the new query window add the following:

    create database cassini;

Now hit F5 to run the query. You should see a success message. If you see a syntax error, check your SQL code and make sure there are no errors. You'll also notice that nothing changed in the left information pane - there's no cassini database! What gives!

Ease up Friendo! Just right click on the Databases folder and refresh - you should see your new database. Once you see it, double-click it and in we go!

Our first table

Our database is going to hold some fun information from the Cassini Mission, the probe that we sent to Saturn back in 1997. All of the data generated by the project is public domain, and it's pretty fun to use that data rather than some silly blog posts, don't ya think?

There's a whole lot of data you can download, but let's keep things reasonable and go with the Master Plan - the dates, times and descriptions of everything Cassini did during its 20 year mission to Saturn. I trimmed it just a bit to bring the file size down, so if you want to play along you can download the CSV from here.

We'll load this gorgeous data in just one second. We have to create a table for it first! Let's do that now by opening a new query window in Azure Data Studio (which I hope you remember how to do). Make sure you're connected to the cassini database, and then enter the following SQL:

    create table master_plan(
      date text,
      team text,
      target text,
      title text,
      description text
    );

This command will, as you might be able to guess, create a table called master_plan. A few things to note:

- Postgres likes things in lower case and will do it for you unless you force it to do otherwise, which we won't.
- We don't have a primary key defined. This is intentional and you'll see why in a second.
- There are a number of ways to store strings in Postgres, but the simplest is text, without a length description.
  This is counterintuitive for people coming from other databases who think this will take up space. It won't, Postgres is much smarter than that.
- Why are we storing a field called date as text? For a very good reason which I'll go over in just a minute.

OK, run this and we should have a table. Let's load some data!

How do I load data into it?

We're going to load data directly from a CSV, which Postgres can do using the COPY command. For this to work properly, however, we need to be sure of a few things:

- We need to have the absolute path to the CSV file.
- The structure of the file needs to match the structure of our table.
- The data types need to match, in terms of format, the data types of our table.

That last bit is the toughest part. CSV (and spreadsheets in general) tend to be a minefield of poorly chewed data-droppings, mostly because spreadsheet programs suck at enforcing data rules.

We have two ways to get around this: suffer the pain and correct the data when we import it, or make sure all the import columns in our database table are text. The latter is the easiest because correcting the data using database queries tends to be easier than editing a CSV file, so that's what we'll do. Also: it's a good idea not to edit the source of an import.

Right - let's get to it! If you're running Docker you'll need to copy the master_plan CSV file into your running container. I put my file in my home directory on my host. If you've done the same, you can use this command to copy the file into your container:

    docker cp ~/master_plan.csv [CONTAINER ID]:master_plan.csv

Once it's there, you can execute the COPY command to push data into the master_plan table:

    COPY master_plan
    FROM '/master_plan.csv'
    WITH DELIMITER ',' HEADER CSV;

This command will grab the CSV file from our container's root directory (as that's where we copied it) and pop the data in positionally into our table.
We just have to be sure that the columns align, which they do!

The last line specifies our delimiter (which is a comma) and that there are column headers. The final bit tells Postgres this is a CSV file.

Let's make sure the data is there and looks right. Right-click on the table and select "Select top 1000 rows" and you should see your rows come back.

Yay data! Before we do anything else, let's add a primary key so I don't freak out:

    alter table master_plan
    add id serial primary key;

Great! Now we're ready to connect from Node.

How do I connect to it from Node?

Let's keep this as simple as possible, for now. Start by creating a directory for the code we're about to write and then initializing a Node project. Feel free to use Yarn or NPM or whatever!

Open up a terminal and:

    mkdir pg_demo
    cd pg_demo
    npm init -y
    npm install pg-promise
    touch index.js

These commands should work in Powershell on Windows just fine.

We'll be using the promise-based Postgres driver from Vitaly Tomilov called pg-promise, one of my favorites. The default Node driver for Postgres works with standard callbacks, and we want promises! There are also a few enhancements that Vitaly threw in which are quite nice, but I'll leave that for you to explore.

The first step is to require the library and connect:

    const pgp = require('pg-promise')({});
    const db = pgp("postgres://postgres:postgres@localhost/cassini");

I'm connecting to Postgres using a URL-based connection string that has the format:

    postgres://user:password@server/db_name

Since we're using Docker, our default username and password is "postgres". You can, of course, change that as needed.

Once we've set up the connection, let's execute a query using some very simple SQL:

    const query = async () => {
      const res = await db.any("select * from master_plan limit 10");
      return res;
    }

Because pg-promise is promise-based, I can use the async and await keywords to run a simple query.
db.any will return a list of results and all I need to do is to pass in a SQL string, as you see I did. I made sure to limit the results to 10 because I don't want all 60,000 records bounding back at me.

To execute the query, I call the method and handle the returned promise. I'll pop the result out to the console:

    query().then(res => {
      console.log(res)
    }).catch(err => {
      console.error(err)
    }).finally(() => {
      db.$pool.end()
    })

The last line in the finally block closes off the default connection pool, which isn't required but the Node process won't terminate unless you do (you'll have to ctrl-c to stop it otherwise).

You can run the file using node index.js from the terminal, and you should see the first ten rows printed out.

Glorious data! Notice it all comes back in lovely, formatted JSON, just as we like.

There's a lot more we can do, but this post is already quite long and I think Friendo might have a few more questions for me. I'll see if he does and I'll follow up next time!
I've written about Full Text Indexing in PostgreSQL before but I was a bit more focused on speed and general use. Today I want to focus on something a lot more useful: relevance.

If you want to play along and have some fun, the SQL for what I'm about to do can be downloaded from here (11K zipped SQL file).

Make Those Results Meaningful!

The data I'm working with is from NDC Sydney and is a list of speakers, their talks, keywords and the time/day of the talk. Simple stuff, but it does present an interesting question:

How would you implement a full text index for this body of data?

Turning the speaker's name, the title and the keywords into a blob of text and then indexing it will work, but it's simply not enough if we expect the results to actually mean something to our users. This is where things get complicated - which also means they get FUN so strap yourself in, let's get TEXTUAL.

What Are You Looking For Anyway?

There's no way we can do this without fully understanding what our users want out of our search functionality, so let's come up with some scenarios:

- Jane has been to 5 conferences already this year and just wants to know what's new with DevOps and Azure. She takes out her phone and, while walking, enters the words as they come to her mind: "devops azure".
- This is Kunjan's first conference and he doesn't know where to start - all he knows is that Heather Downing is speaking and he really wants to be sure he can see her talks, so he searches exactly on that: "Heather Downing".
- Nenne is excited about Blazor and knows the dev team is here, showing it off. She can't remember their names - just the project name - so she searches on that: "blazor".

The Problems

We have three different kinds of searches here:

- The first is contextual, which means that Jane knows the topics she's interested in and wants to throw a list of words at our search, hoping for a ranked match.
- The second is specific: Kunjan wants to see a specific speaker's talk - that means we need to be sure that we can return a hit on an exact part of a first or last name.
- Finally, Nenne's query is relative, which means she knows a term (the project name) and wants to see results relative to it.

If we're to show these people something meaningful we'll need to come up with a strategy for building our full text index. Thankfully, Postgres has the tools we need.

Let's take a quick second to understand what goes on behind the scenes as our full text index is being created - it's really helpful when trying to debug things. Then we'll move on and create solutions for each of these problems.

Behind the Scenes

A full text index is actually a data type in Postgres called tsvector. It's a weird name, but what it does is pretty simple:

    select to_tsvector('english', 'nothing too tricky here');
         to_tsvector
    ---------------------
     'noth':1 'tricki':3
    (1 row)

I'm using Postgres's built-in to_tsvector function to tokenize my string of words into lexemes. What's a lexeme you ask? Hey, good question!

"A lexeme is a unit of lexical meaning that underlies a set of words that are related through inflection. It is a basic abstract unit of meaning, a unit of morphological analysis in linguistics that roughly corresponds to a set of forms taken by a single root word." - Wikipedia

You can apply various stems to a lexeme to create a set of different words. So "noth" in this case could be stemmed to "nothing" or "nothingness". The integers that you see in the results above are the position within the text body. The first word is "nothing" so we have a 1 and "tricky" is the third word. This comes in handy later on when we want to know positional information (which we will!).

Finally, you'll notice that "too" and "here" have been stripped. These are stop words (or noise words) and aren't indexed.

But how does all of this tokenization happen?

Postgres ships with a number of dictionaries that parse a given blob of text.
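To make the positions and stop words concrete before we look under the hood, here's a toy version of the idea in JavaScript. It's only a sketch: toyTsvector is a made-up name, the stop word list is a tiny hand-picked assumption, and there's no stemming at all, so "nothing" stays "nothing" instead of becoming "noth".

```javascript
// Tiny, hand-picked stop word list - an assumption for this example only.
const STOP_WORDS = new Set(["too", "here", "the", "a", "and"]);

// Toy tsvector: lower-case the text, split it into words, skip stop words,
// and remember the 1-based position of every word that survives.
function toyTsvector(text) {
  const positions = new Map();
  const tokens = text.toLowerCase().match(/[a-z0-9]+/g) || [];

  tokens.forEach((token, index) => {
    // Stop words still "use up" a position, they just aren't indexed.
    if (STOP_WORDS.has(token)) return;
    if (!positions.has(token)) positions.set(token, []);
    positions.get(token).push(index + 1);
  });

  // Like a tsvector, list the lexemes in sorted order.
  return [...positions.keys()]
    .sort()
    .map(word => `'${word}':${positions.get(word).join(",")}`)
    .join(" ");
}

console.log(toyTsvector("nothing too tricky here"));
// 'nothing':1 'tricky':3
```

Notice that "tricky" still reports position 3 even though "too" was dropped, which is the same thing you see in the real to_tsvector output.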
If you want to raise the hood on this, you can run the ts_parse function to see what happens:

    select * from ts_parse('default', 'nothing too tricky here');
     tokid |  token
    -------+---------
         1 | nothing
        12 |
         1 | too
        12 |
         1 | tricky
        12 |
         1 | here
    (7 rows)

The first argument to this function is the name of the parser, which I'm setting to default as I don't want to break anything. What I get back is a list of tokens and their id. 1, for instance, is an ascii word and 12 is blank space.

You can see a lot more information if you use the ts_debug function, which is designed to help you if you're fiddling with the search config stuff:

    select * from ts_debug('nothing too tricky here');
       alias   |   description   |  token  |  dictionaries  |  dictionary  | lexemes
    -----------+-----------------+---------+----------------+--------------+----------
     asciiword | Word, all ASCII | nothing | {english_stem} | english_stem | {noth}
     blank     | Space symbols   |         | {}             |              |
     asciiword | Word, all ASCII | too     | {english_stem} | english_stem | {}
     blank     | Space symbols   |         | {}             |              |
     asciiword | Word, all ASCII | tricky  | {english_stem} | english_stem | {tricki}
     blank     | Space symbols   |         | {}             |              |
     asciiword | Word, all ASCII | here    | {english_stem} | english_stem | {}
    (7 rows)

I think this is interesting, but it's also academic for our needs. Let's get back on track and set up our search index.

Task 1: No Stems for Names!

Before we index anything, we need to consider what the thing is and also what it is not. A little vague, but let's start with names.

Names are specific. While one could make the argument that some names might be more common in a given language, I think we can agree that's problematic. In that sense, tokenizing a name as if it's English words doesn't make sense.

Heather's last name is "Downing", which could refer to what she might do to a glass of cold water after a long run or what she did to enemy planes during the war.
Neither of those is the case, yet that's exactly how the tokenizer will treat her name.

That's how full text queries work in Postgres: matching lexemes. The to_tsquery function you see here simply tokenizes the term given to it, applying the rules of the dictionary you specify, which is english in my case:

    select to_tsquery('english', 'downing');
     to_tsquery
    ------------
     'down'
    (1 row)

We can fix this problem by using a different dictionary. This makes perfect sense since we don't consider names part of a language! For this, Postgres gives us the simple dictionary:

    select to_tsquery('simple', 'downing');
     to_tsquery
    ------------
     'downing'
    (1 row)

The simple dictionary doesn't create a lexeme from the token given to it - it just returns the raw word (unless it's a noise word) lower-cased. This will work perfectly for indexing our names:

    select to_tsvector('simple', body ->> 'name') from ndc limit 5;
           to_tsvector
    -------------------------
     'alex':1 'mackey':2
     'adam':1 'furmanek':2
     'kristy':1 'sachse':2
     'downing':2 'heather':1
     'passos':2 'thiago':1
    (5 rows)

Perfect. We'll use this when building our overall index in just a minute.

Applying Weights to Keywords

Proper tagging is difficult to do. I'm not going to spend time on how to do that - let's just assume that you and your app have a cool set of tags you're happy with. Now comes the big question: are those tags words?

On one hand, it seems like the answer should be yes. Tags are contextual and tend to be things like "database", "career", "azure" etc. But what about the tags "virtual-machines" or "virtual-network"?

    select to_tsvector('virtual-network');
                     to_tsvector
    ---------------------------------------------
     'network':3 'virtual':2 'virtual-network':1
    (1 row)

    select to_tsvector('virtual-machines');
                    to_tsvector
    -------------------------------------------
     'machin':3 'virtual':2 'virtual-machin':1
    (1 row)

Both of these tags will match on the term "virtual", no matter what it's followed by.
That means we'll get a hit on "virtual-conference", "virtual-meeting", and virtually everything, since the word "virtually" will turn into the lexeme "virtual". That might be OK, it really depends on your tagging strategy. For me, I'll be using the simple dictionary once again because tags are specific, simple terms for this conference.

OK - now let's address the weighting. We can apply weights to our tags by using the setweight function in Postgres:

    select setweight(to_tsvector('simple', (body ->> 'tags')),'A') from ndc limit 5;

     'cloud':1A 'fun':2A
     'microsoft':2A 'net':1A
     'agile':1A 'design':2A 'devops':8A 'methodology':9A 'people':3A 'skills':6A 'soft':5A 'soft-skills':4A 'ux':7A
     'agile':1A 'ethics':6A 'people':2A 'skills':5A 'soft':4A 'soft-skills':3A
     'cloud':2A 'database':3A 'microsoft':4A 'net':1A
    (5 rows)

Weighting is simply a matter of applying a letter suffix to the positional integer. As you can see, cloud:1A has replaced cloud:1. That will be used when we run our query later on.

Oh yeah - something neat to note here is that Postgres is smart enough to take a JSONB array value and turn it into a text array for us, on the fly, and then apply indexing :).

Weighting Considerations

At this point we need to figure out relative weighting for the information we'll be searching. If you have only one text blob that you're indexing, then it doesn't make sense to apply weighting - but that's rarely the case in an online app.

The thing you need to consider when weighting is: what hits are valued more than others? Weighting doesn't affect which records will be recognized, it simply lifts those records to the top depending on how you weighted them (A through D).

I'm going to make the choice that if someone enters a tag, that should be raised to the top.
Next would be someone's name (though you could argue it should be the other way around) and finally whatever was found in the title.

Given this, we can build our entire search index with something like:

    select
      setweight(to_tsvector('english', (body ->> 'tags')), 'A') || ' ' ||
      setweight(to_tsvector('simple', (body ->> 'name')), 'B') || ' ' ||
      setweight(to_tsvector('english', (body ->> 'title')), 'C')::tsvector
      as search_index
    from ndc
    limit 5;

Note: you'll notice that I'm using the || operator to concatenate the values together, including a space between them. If you don't do this you'll get words jammed together and crappy results.

We've applied the top weight, A, to tags and B to name, with title coming in last with C. This is just relative ranking, which means that terms found in the keywords are ranked higher than the title, for instance. That will help Jane find her DevOps and Azure talks.

Kunjan will find Heather's talk as we're not stemming names - so he won't get confused with bad results. And finally Nenne will easily find her Blazor talk as the name appears in the title.

The only tricky part to this is if a speaker's name appears in the title of a talk - so "Juana Blazor" might throw off the result - but there's simply no way we can know which our user might want. We can, however, make the decision that hits in the names should be counted higher! Which is what we did.

Let's add a generated column to our ndc table and test it out!

    alter table ndc
    add search tsvector
    generated always as (
      (
        setweight(to_tsvector('english', (body ->> 'tags')), 'A') || ' ' ||
        setweight(to_tsvector('simple', (body ->> 'name')), 'B') || ' ' ||
        setweight(to_tsvector('english', (body ->> 'title')), 'C')
      )::tsvector
    ) stored;

This is a new feature in Postgres 12 - generated columns. They're virtual columns that are (for now) stored on disk and completely managed by Postgres.
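A quick aside on what those A, B and C letters buy us. The toy below shows the principle that weights change the order of the hits and not which records match. It is not how Postgres computes rank: toyRank simply adds up a number per matching lexeme, and the three "talks" are invented for the example. The numbers in WEIGHTS are the defaults Postgres documents for its ranking functions (D = 0.1, C = 0.2, B = 0.4, A = 1.0).

```javascript
// Default per-weight multipliers documented for Postgres ranking functions.
const WEIGHTS = { A: 1.0, B: 0.4, C: 0.2, D: 0.1 };

// Toy score: add up the weight of every indexed lexeme that equals the term.
function toyRank(entries, term) {
  return entries
    .filter(entry => entry.lexeme === term)
    .reduce((sum, entry) => sum + WEIGHTS[entry.weight], 0);
}

// Invented sample data: where the lexeme was found decides its weight.
const talks = [
  { title: "Term appears in the title", entries: [{ lexeme: "azure", weight: "C" }] },
  { title: "Term appears in the tags", entries: [{ lexeme: "azure", weight: "A" }] },
  { title: "Unrelated talk", entries: [{ lexeme: "blazor", weight: "C" }] }
];

// Matching is decided first (rank > 0), weights only sort the survivors.
function search(term) {
  return talks
    .map(talk => ({ title: talk.title, rank: toyRank(talk.entries, term) }))
    .filter(talk => talk.rank > 0)
    .sort((a, b) => b.rank - a.rank);
}

console.log(search("azure"));
```

Both "azure" talks come back, with the tagged one first, and the unrelated talk never shows up no matter how it's weighted.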
Whenever our record is updated our search index will be too!

We're now ready to start querying.

Constructing a Proper Query

Let's start with the 3rd example first: "blazor", which is Nenne's query. This isn't a keyword match because it's not part of our tags, but it is a project name which will, hopefully, appear in a title somewhere. In that case, we can run the following query just fine:

    select body ->> 'title' as title, body ->> 'name' as name
    from ndc
    where search @@ to_tsquery('english', 'blazor');

    -[ RECORD 1 ]--------------------------------
    title | Blazor, a new framework for browser-based .NET apps
    name  | Steve Sanderson
    -[ RECORD 2 ]--------------------------------
    title | Blazor in more depth
    name  | Steve Sanderson Ryan Nowak

Groovy! We're using our tsvector field, search, and comparing it with @@ against the result of the to_tsquery function. We get back some results and we can see that we have Blazor in the title. Great!

At that point Nenne remembers that Steve Sanderson is one of her favorite speakers, so she decides to search both "blazor" and "Sanderson":

    ERROR: syntax error in tsquery: "blazor sanderson"

Oh no! What happened? The short answer is that to_tsquery expects its terms to be joined with operators (such as & or |), so a bare pair of words is a syntax error - which seems really weird at first! I mean this is a full text search dude! WTF?

The problem is that Postgres doesn't know what you want to do with more than one word. Is it just a collection of words? Or is it a phrase which has some structure to it?
The query "blazor Sanderson" doesn't mean anything to you or me, but Jane's query "Azure DevOps" could be considered a phrase, where the term "Azure" needs to come before "DevOps".

For a plain collection of words, we can modify our query using plainto_tsquery:

    select body ->> 'title' as title, body ->> 'name' as name
    from ndc
    where search @@ plainto_tsquery('english', 'blazor sanderson');

    -[ RECORD 1 ]------------------------------------
    title | Blazor, a new framework for browser-based .NET apps
    name  | Steve Sanderson
    -[ RECORD 2 ]------------------------------------
    title | Blazor in more depth
    name  | Steve Sanderson Ryan Nowak

Yes! Boom! That works really well. The function plainto_tsquery takes a plain text blob and treats it just like a bunch of words. In fact you can see exactly what it does by asking Postgres:

    select plainto_tsquery('blazor sanderson');

        plainto_tsquery
    ------------------------
     'blazor' & 'sanderson'
    (1 row)

The text gets parsed into individual words, tokenized and turned into lexemes and then placed into a logical AND condition. In other words: both blazor and sanderson must be in the search index.

But what about Jane's query? She wants to know what's new with Azure DevOps:

    select body ->> 'title' as title, body ->> 'name' as name
    from ndc
    where search @@ plainto_tsquery('english', 'azure devops');

    -[ RECORD 1 ]-----------------------
    title | Static Sites, Dynamic microservices, & Azure: How we built Microsoft Docs and Learn
    name  | Dan Fernandez
    -[ RECORD 2 ]-----------------------
    title | DataDevOps for the Modern Data Warehouse on Microsoft Azure
    name  | Lace Lofranco

Hmmm. Well, that sort of worked, in that we have two talks about Azure that also have the term devops in the title. However, there's nothing there about the Azure DevOps product.
One way that we can fix this is to send in a phrase rather than a blob of words using phraseto_tsquery:

    select body ->> 'title' as title, body ->> 'name' as name
    from ndc
    where search @@ phraseto_tsquery('english', 'azure devops');

    (0 rows)

This is a bit more accurate: there aren't any talks specifically about Azure DevOps. The phraseto_tsquery function leverages the positional information that's stored with the tsvector, making sure that one word appears directly before another. You can see this if you ask Postgres what's going on:

    select phraseto_tsquery('azure devops');

      phraseto_tsquery
    --------------------
     'azur' <-> 'devop'

The words are tokenized into lexemes once again, but this time there's the positional <-> operator, indicating that azure must appear directly before devops in the string (the inclusive AND is implied).

OK, let's make sure that Kunjan can find Heather's talk and then we'll be done! I'll use the regular plainto_tsquery here since I want to be sure we match properly on name:

    select body ->> 'title' as title, body ->> 'name' as name
    from ndc
    where search @@ plainto_tsquery('Downing');

    (0 rows)

Good grief - no results!?!?! What the heck?

Using the Right Dictionary

The problem we're having is matching dictionaries. When we use to_tsquery or, in this case, plainto_tsquery, the words we pass in will be tokenized according to some kind of dictionary. The default comes from the default_text_search_config setting, which is typically chosen from the server's locale when the database cluster is created - so it's usually the language of the region of the server.

In the case of our name tokens, however, we used the simple dictionary, which means the names weren't stemmed and a stemmed query term will therefore fail to match.

To see what I mean, take a look at our plainto_tsquery for Downing using the default dictionary (which is english in my case):

    select plainto_tsquery('Downing');

     plainto_tsquery
    -----------------
     'down'
    (1 row)

We're trying to match a literal term to a lexeme, so of course we're going to have problems.
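To make the mismatch concrete, here is a minimal sketch that runs without the ndc table. It assumes, as in the output above, that the english configuration stems "Downing" to 'down'; the literal name is just sample input.

```sql
-- The name as it was indexed: the simple configuration only lowercases.
select to_tsvector('simple', 'Heather Downing');
-- 'downing':2 'heather':1

-- The same word as the query side sees it with the english configuration.
select plainto_tsquery('english', 'Downing');
-- 'down'

-- A stemmed query term can't match an unstemmed lexeme.
select to_tsvector('simple', 'Heather Downing')
       @@ plainto_tsquery('english', 'Downing') as matches;
-- f
```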
We can get over this by using the simple dictionary with plainto_tsquery:

    select body ->> 'title' as title, body ->> 'name' as name
    from ndc
    where search @@ plainto_tsquery('simple', 'Downing');

    -[ RECORD 1 ]------------------------------
    title | Keynote: The Care and Feeding of Software Engineers
    name  | Heather Downing

Much better! But this raises another question...

How Do You Query With Two Dictionaries?

I want to be able to query with both the English and simple dictionaries - but how can I do that and still get reasonable results?

The simplest way to do this is with an OR query:

    select body ->> 'name' as name,
      body ->> 'title' as title,
      body ->> 'tags' as tags
    from ndc
    where search @@ plainto_tsquery('english', 'heather keynote') OR
          search @@ plainto_tsquery('simple', 'heather keynote');

    -[ RECORD 1 ]-------------------------------
    name  | Heather Downing
    title | Keynote: The Care and Feeding of Software Engineers
    tags  | ["agile", "people", "soft-skills", "ethics"]

It's a bit on the verbose side, but as you can see we were able to find Heather's keynote just fine. Note also that I'm using plainto_tsquery here because I'm expecting a word salad. I could change that, however, in the case of names.

We're almost done! Now let's sort our results in a meaningful way.

Ranking The Result Using Our Weighting

Weighting doesn't do much good unless we can apply it, so for that we'll need to make sure there's some form of score we can use when querying. For that, we have Yet Another Postgres Function: ts_rank.

There are actually two of these functions. The first is ts_rank, which is a score based on word frequency, and the second is ts_rank_cd, which is based on frequency but also cover density - which basically takes into account how close the matching words are to each other in the document.
For us, ts_rank will do fine.

To use these functions you have to pass in the tsvector value as well as the tsquery:

    select ts_rank(search, plainto_tsquery('english', 'devops')) +
           ts_rank(search, plainto_tsquery('simple', 'devops')) as rank,
      body ->> 'name' as name,
      body ->> 'title' as title,
      body ->> 'tags' as tags
    from ndc
    where search @@ plainto_tsquery('english', 'devops') OR
          search @@ plainto_tsquery('simple', 'devops')
    order by rank desc
    limit 5;

    -[ RECORD 1 ]-----------------------------------------------
    rank  | 0.9074664
    name  | Ashley Noble
    title | Trials and Tribulations of a DevOps transformation in a large Company
    tags  | ["devops"]
    -[ RECORD 2 ]-----------------------------------------------
    rank  | 0.6383234
    name  | Damian Brady
    title | Pragmatic DevOps - How and Why
    tags  | ["devops"]
    -[ RECORD 3 ]-----------------------------------------------
    rank  | 0.6079271
    name  | Enrico Campidoglio
    title | Understanding Git Behind the Command Line
    tags  | ["t", "devops"]
    -[ RECORD 4 ]-----------------------------------------------
    rank  | 0.6079271
    name  | Pooja BhaumikNick Randolph
    title | Using Flutter to develop cloud enabled mobile applications
    tags  | ["cross-pl", "mobile", "devops"]
    -[ RECORD 5 ]-----------------------------------------------
    rank  | 0.6079271
    name  | Klee Thomas
    title | Picking up the pieces - A look at how to run post incident reviews
    tags  | ["agile", "devops"]

Update: the original post had the query bits aliased but, as mentioned by Oleg in the comments, that wasn't a very efficient query as it would require nested loops and joins. The query you see here is a bit more verbose, but a lot more efficient.

A few things to note about this code: I'm adding the ts_rank results together because each tsquery is going to have its own score (I'll get into this in a bit), and I limited the results, because there are a lot.

The OR query works great and we're able to query by names, tags and titles and we're almost done - but as you can see the scoring is weird.

Postgres does some voodoo math behind the scenes and honestly it doesn't really matter what those scores are all about - what does matter is that some are scored higher than others and we need to make sure our scoring scheme works as we want.

Looking at the top 2 it's easy to see it does: they have the term devops in the tags as well as the title. This is a classic SEO rule for the web, and we should feel good about our search strategy, don't you think? I guess it can be abused, however, if we pretend it's 1998 and load our title and speaker's name with keywords:

    select ts_rank(search, plainto_tsquery('english', 'devops')) +
           ts_rank(search, plainto_tsquery('simple', 'devops')) as rank,
      body ->> 'name' as name,
      body ->> 'title' as title,
      body ->> 'tags' as tags
    from ndc
    where search @@ plainto_tsquery('english', 'devops') OR
          search @@ plainto_tsquery('simple', 'devops')
    order by rank desc
    limit 5;

    -[ RECORD 1 ]-----------------------------------------------
    rank  | 0.9074664
    name  | Ashley DevOps Noble
    title | DevOps Trials and DevOps Tribulations of a DevOps transformation in a large DevOps Company
    tags  | ["devops"]
    -[ RECORD 2 ]-----------------------------------------------
    rank  | 0.6383234
    name  | Damian Brady
    title | Pragmatic DevOps - How and Why
    tags  | ["devops"]
    -[ RECORD 3 ]-----------------------------------------------
    rank  | 0.6079271
    name  | Enrico Campidoglio
    title | Understanding Git Behind the Command Line
    tags  | ["t", "devops"]
    -[ RECORD 4 ]-----------------------------------------------
    rank  | 0.6079271
    name  | Pooja BhaumikNick Randolph
    title | Using Flutter to develop cloud enabled mobile applications
    tags  | ["cross-pl", "mobile", "devops"]
    -[ RECORD 5 ]-----------------------------------------------
    rank  | 0.6079271
    name  | Klee Thomas
    title | Picking up the pieces - A look at how to run post incident reviews
    tags  | ["agile", "devops"]

OK it's not perfect, but it's much better than indexing a blob of text because:

- We can recognize speaker names
- We're weighting tag recognition over title
- We're weighting tags and names over the loose text of a title

I think for most web applications this will work really well!

Flexing Postgres 12

Trying to decide between to_tsquery, plainto_tsquery and phraseto_tsquery can be difficult. It was kind of straightforward in our case - we're not searching on any phrases really.

The Postgres team decided to be helpful in this regard, especially when it comes to web applications, so they created websearch_to_tsquery (available since Postgres 11). It basically treats the input as if it were entered into a Google search. To be dead honest I have no idea what's happening under the covers here, but it's supposed to be a bit more intelligent than plainto_tsquery and a little less strict than phraseto_tsquery.

I've played with it a few times and haven't noticed much of a difference - it is worth noting however!

Phew! Long post - hope it was helpful!
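For what it's worth, you can ask Postgres what websearch_to_tsquery builds, the same way as with plainto_tsquery earlier. This is a small sketch rather than anything from the original queries: it uses the simple configuration so stemming doesn't muddy the output, and the search strings are made up for illustration.

```sql
-- Unquoted words are ANDed together, like plainto_tsquery.
select websearch_to_tsquery('simple', 'azure devops');
-- 'azure' & 'devops'

-- Quoted text becomes a phrase, like phraseto_tsquery.
select websearch_to_tsquery('simple', '"azure devops"');
-- 'azure' <-> 'devops'

-- The word "or" turns into the OR operator.
select websearch_to_tsquery('simple', 'azure or blazor');
-- 'azure' | 'blazor'

-- A leading dash negates a term.
select websearch_to_tsquery('simple', 'azure -devops');
-- 'azure' & !'devops'
```

The other practical difference is that it is documented never to raise a syntax error on raw user input, which is the main reason to reach for it behind a search box. Using it in the queries above should just be a matter of swapping the function name, for example where search @@ websearch_to_tsquery('english', 'blazor sanderson').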
The PostgreSQL team has been jamming out updates on a regular basis, adding some amazing features that I hope to go into over time, but one of these features made me extremely excited! Generated columns:

    A generated column is a special column that is always computed from other columns. Thus, it is for columns what a view is for tables.

Yay!

What this means is that you can have a managed "meta" column that will be created and updated whenever data changes in the other columns.

Too bad Dee didn't know about this when she was working with the Cassini data! Setting up those search columns would have been much easier!

An Example: A Fuzzy Search for a Document Table

Let's say you have a table where you store JSONB documents. For this example, I'll store conference talks in a table I'll call ndc, since I was just there and did just this:

    create table ndc(
      id serial primary key,
      body jsonb not null,
      created_at timestamptz not null default now(),
      updated_at timestamptz not null default now()
    );

Here's an example of a talk - a real one I scraped from the NDC site, which happens to be Heather Downing's amazing keynote:

    {
      "title": "Keynote: The Care and Feeding of Software Engineers",
      "name": "Heather Downing",
      "location": "Room 1",
      "link": "https://ndcsydney.com/talk/keynote-the-care-and-feeding-of-software-engineers/",
      "tags": ["agile", "people", "soft-skills", "ethics"],
      "startTime": { "hour": 9, "minutes": 0 },
      "endTime": { "hour": 10, "minutes": 0 },
      "day": "wednesday"
    }

This wad of JSON will get stored happily in our new table's body field, but querying it might be a pain. For instance - I might remember that Heather's talk is the Keynote, but it's a long title so remembering the whole thing is a bummer. I could query like this:

    select * from ndc where body ->> 'title' ilike 'Key%';

Aside from being a bit of an eyesore (the body ->> 'title' stuff is a bit ugly), the ilike 'Key%' has to run a full table scan, loading up the entire JSON blob just to make the comparison.
Not a huge deal for smaller tables, but as a table grows this query will start sucking resources.

We can fix this easily using the new GENERATED syntax:

    alter table ndc
    add column title text
    generated always as (body ->> 'title') stored;

Run this and the generated column is created and then populated as well! Check it: title is now a relational column.

But wait, there's more. If we tried to run our search query with the fuzzy match on title we'd still have to do a full table scan. Generated columns actually store the data as opposed to computing it at query time, which means we can index them:

    create index idx_title on ndc(title);

BAM! What used to require a few triggers and an occasionally pissed off DBA is now handled by PostgreSQL.

Also - just to be sure this is clear - we could also have declared this in the original definition if we wanted:

    create table ndc(
      id serial primary key,
      body jsonb not null,
      title text generated always as (body ->> 'title') stored,
      created_at timestamptz not null default now(),
      updated_at timestamptz not null default now()
    );
    create index idx_title on ndc(title);

Into the Weeds: The Search Field

Adding a full text search index would seem to be the obvious use of GENERATED, don't you think? I decided to wait on that because, for now, it's not exactly straightforward.

If all I wanted to do was to search on the title of a talk then we're in business... sort of:

    alter table ndc
    add search tsvector
    generated always as (to_tsvector('english', body ->> 'title')) stored;

This works really well. But it took me about 2 hours (seriously) to figure this out as I kept getting a really annoying error, which I'll go into in a minute:

    ERROR: generation expression is not immutable

Long story short, if you don't pass the english language configuration to the to_tsvector function then things will fail.
The expressions that you use to define a generated column must be immutable, as the error says, but understanding which functions are and are not can be a bit of a slog.

Deeper Into the Weeds: Using Concat

Let's keep going and break things, shall we? We've got a lot of lovely textual information in our JSON dump, including tags and name. This is where we earn our keep as solid PostgreSQL brats because we know, ohhh do we know, that a blanket full text indexing that tokenizes everything evenly is pure crap :).

We'll want to be sure to weight the tags and maybe suppress the tokenization of names - I'll get to that in a later post - right now I just want to take the next step, which is to add other fields to our search column. All we have at the moment is the title - let's add name:

    alter table ndc drop search;
    alter table ndc
    add search tsvector
    generated always as (
      to_tsvector('english',
        concat((body ->> 'name'), ' ', (body ->> 'title'))
      )
    ) stored;

I formatted this so it reads better - hopefully it's clear what I'm trying to do? I'm using the concat function to, well, concatenate the name with a blank space and then a title. I need that blank space in there, otherwise the name and title will be rammed together making it useless.

    ERROR: generation expression is not immutable

Crap! What? This is a concatenation!?!?! How is this not immutable? Turns out it's the concat function that's causing the problem, and I'm not sure why (if you know please leave me a comment). This, however, does work:

    alter table ndc drop search;
    alter table ndc
    add search tsvector
    generated always as (
      to_tsvector('english',
        (body ->> 'name') || ' ' || (body ->> 'title')
      )
    ) stored;

That, my friends, is super hideous - but it gets the job done. I'll get more into full text indexes in a later post as I've had some really good fun with them recently.

Summary

I've had a lot of fun goofing around with the generated bit.
If you're wondering, the actual update goes off right after the before trigger would normally go off - so if you do have a before trigger on your table, any changes it makes to the base columns will be reflected in the generated values, but the trigger itself can't read the generated columns.

You also might be wondering about the stored keyword you see here? Right now it's the only option: the generated bits are stored on disk next to your data. In future releases you'll be able to specify virtual for just-in-time computed bits, but not now.
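On the "not immutable" errors above: Postgres records how volatile each function is in the pg_proc catalog, so it's possible to check before spending two hours guessing. A quick sketch, assuming nothing beyond the standard catalog columns; textcat is the function behind the || operator for text.

```sql
-- provolatile: i = immutable, s = stable, v = volatile.
-- Only immutable functions are allowed in a generated column expression.
select p.proname,
       pg_get_function_arguments(p.oid) as args,
       p.provolatile
from pg_proc p
where p.proname in ('to_tsvector', 'concat', 'textcat')
order by p.proname, args;
```

The one-argument to_tsvector(text) should show up as stable, because its result depends on the default_text_search_config setting, while the two-argument form that names the configuration is immutable. concat is marked stable as well, since it accepts arguments of any type and converting those to text can depend on session settings. textcat is immutable, which would explain why the || version is accepted.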
Audiobook Review: Fall; Or, Dodge in Hell by Neal Stephenson
The problem with any book by Neal Stephenson is that the person foolish enough to try and review it has to start somewhere. A foothold on the story, its arc, the social relevance and a bunch of other blah blah blah. To use a common Stephenson affectation: with this story, like the rest of his stories, there isn't such.

Because to get to that somewhere involves introspecting the story and coming up with a Gestalt-ish summation of WTF just happened. Stephenson's stories are not short and are brutally thick with cause and effect to the point where you just kind of lose your train of thought, as I'm doing.

This story is f***ing overwhelming, and life changing. It's also a bunch of other hyperbolic superlatives that I'll just wrap up with the typical feeling I have after finishing any of Mr. Stephenson's books (whether first or fourth reading): I'll never think the same way again.

The Commitment

Every Stephenson book I've read requires commitment on the reader's part. Some people dig the challenge, others just aren't into the mental toil. I'm both, if I'm honest. It took me 3 efforts to get through Anathem, 5 to get through Cryptonomicon and I'm still trying to get through Seveneves. It's a rite of passage.

My brother threw Fall across the room, and he's not a light reader (Pynchon is his favorite writer). Of the 8 friends I have who are trying to read this book, 3 have given up, 2 reported it to be a slog and 3 utterly loved it. I was each of these people.

I don't think it's possible to find two people who agree on this book, which, to me, means it's worth trying, and that is my first point: take this book on. Accept the challenge. It's worth it.

A Digital Afterlife

This book is about the digital afterlife. If you're a Black Mirror fan and are thinking about San Junipero, don't - it's not that at all.
The premise, however, is sort of the same.

The book picks up a few years after REAMDE and features many of the same characters (C+, Zula, Richard "Dodge" Forthrast, etc) and it also weaves in characters from other Stephenson stories - most notably Cryptonomicon and the Baroque Cycle. I thought that was a nice touch.

The story starts off with energy and pulls you in quickly, starting with the death of Dodge and, to me, a masterful social media hoax involving a nuclear bomb going off over Moab, Utah.

This is Stephenson at his best. Breathtaking depth and vision with absurd assertions backed up by relentless science. The nuclear bomb turns out to be a hoax, but the entire world buys into it because video footage (using actors from a fake movie) dropped on social media played into people's fears. Meanwhile, a DDoS attack in Moab itself cut it off from the internet - so for a few days people believed that it was simply gone.

That and the following chapters were the most fun to read and, consequently, the ones I forgot about first. They deal with the destruction of the internet as we know it, replaced by a blockchain analogue that embraces accountability. I had to relisten to a few chapters (out of choice) because Stephenson goes into such rich detail and I absolutely loved the idea of an "internet apocalypse", something I don't think I've read about before.

From there things get weird, fast.

Shaping a Digital Afterlife

I don't think it's possible to spoil a Stephenson book because the fun is in the journey, not the story. If you remember anything from this review it's that: it's a long, slow, dense journey. If you want to be totally surprised by it, stop reading here. I don't plan to give away major plot points but, just in case.

Stephenson doesn't just spring the idea of a digital afterlife on you, he shapes a world - the only world - in which it's possible to have such a thing.
He addresses the computational needs (using quantum machines of course) and the energy and cooling required. The center of this phase of the book - the buildup if you will - is a legal back and forth regarding Dodge's will. In it he specified exact instructions on what to do with his body when he dies because he wants it preserved for the time and place when it's possible to load his connectome (his digital self) into a computer.

    One of the funny things about it, in retrospect, was its slowness, the lack of any dramatic Moment When It Had Happened. It was a little bit like the world's adoption of the Internet, which had started with a few nerds and within decades become so ubiquitous that no person under thirty could really grasp what life had been like before you could Google everything.
    - Neal Stephenson, Fall; Or, Dodge in Hell

This is what I love about Stephenson: you get completely lost in his overly active brain. He comes at you with so many ideas that you have to pause the audio (or put the book down) to let the myriad whacko nuttiness settle and form some type of recognizable concept.

The idea of mapping a human brain by creating what is, essentially, a 3-dimensional graph of neural connections (the "connectome") is brilliant and bullshit at the same time. I think. Maybe not. I really have no idea! Stephenson's tech-literary Judo comes at you so fast you find yourself on the ground before you know what's happening, disbelieving all of it until you realize that hey, wow, you're on the ground and he's standing over you laughing, trying to help you back up.

What just happened?

When Sophia, Dodge's niece, goes to work for Corporation 9592 she is tasked with understanding the DB, or "Dodge's Brain". All they have is his connectome in the form of binary files on disk, which she loads into a distributed quantum network and just turns on.

I loved this cowboy coder technique of booting up the first digital brain because it's exactly what would have happened.
It takes everyone by surprise in the book too, which I thought was wonderful, and I found myself giggling as Sophia just kind of shrugged while the scientists around her pitched fits. Nice touch :).

Behold: Egdod

Skipping ahead: they managed to boot Dodge's brain and his consciousness came online. I thought more work could have been done here but Stephenson chose to go down a path that I thought was simultaneously interesting, odd, and also obvious. He chose to form the digital afterlife into some kind of fantasy realm.

Egdod is Dodge's avatar from T'Rain, the MMO at the center of REAMDE (a great book btw). When Dodge awakens his brain is forced to make sense of chaos, and eventually does, like something straight out of the book of Genesis. Egdod ("Dodge" spelled backwards) is definitely god-like, sprouting wings and forming the land with not much recollection of who he is or what's happened to him.

It was hard for me to make this transition. I loved the Black Mirror world that Stephenson was building and the stories happening therein with characters that I was attached to from his previous books. Transitioning to a fantasy story seemed clumsy to me.

And Stephenson being Stephenson, he clubs you over the head with it until you submit. Egdod is joined by other souls over time and they start to branch out into other forms, including a hive of souls that mass together and threaten Egdod's superiority. At one point, a soul named Spring figures out how to create life from the chaos in the form of a bee.

Everything about this part of the book - the tone, the speech patterns, and the prose - is reminiscent of a fairly standard fantasy novel full of gods, dense forests and magic. Eventually you get used to it, and then...

Back to The Real World

Stephenson transitions between these worlds abruptly, as if trying to reinforce the idea that you are, in fact, reading two books.
This made sense to me because a digital afterlife really is a different plane of existence (as it's described later in the book). I mean: if you ended up in San Junipero would you want to spend time thinking about your previous existence?

I don't want to make light of this point (and Stephenson certainly didn't). The characters in both realms (real and digital) don't interact at all. The only way people in the real world can understand what's happening in Bitworld (as they call it) is through the use of a viewer which tracks signal associations between processes. It's overwhelming to think about how that might work, and the characters in the book get used to it in the same way Cipher got used to spotting people in the Matrix by looking at the code from the image translators.

Over decades, the viewer becomes so good that people begin watching it as a source of entertainment, leading to one of my favorite lines from the book:

    The living stayed home, haunting the world of the dead like ghosts.
    - Neal Stephenson, Fall; Or, Dodge in Hell

The dead in Bitworld, however, have some idea about where they came from but it's assigned a mythical quality. The only one to have a full understanding of his past is Dodge, who is defeated in battle by his nemesis in the book and, at the same time, gains full understanding of both worlds. Interestingly, nothing more comes of this, which I thought was weird.

The Fading of the Real World

Decades go by in the real world while in Bitworld, time flows according to processing power. Another fine touch by Stephenson: addressing the time slip between Bitworld and the real world. Time goes by as normal in Bitworld, but to the people watching it at home it will slow to a crawl or speed up to the point where days go by in minutes.

To handle the processing power, servers are put in orbit to capture solar radiation for energy, and thermal radiation is mirrored out into space to avoid overheating the planet. I like the way Stephenson handles this.
There's a lot of money to be made off of Bitworld so we will invent accordingly.

Interestingly, Stephenson doesn't spend much time talking about too many other real world details. That's been done to death in other stories - he's far more interested in how Bitworld matures over time. It's in this part of the book that I think most people get lost or flat out give up.

It's easy to see why: the entire narrative is thrown completely on its head. Bitworld matures into a full-blown fantasy realm straight out of Lord of the Rings. People can do different forms of magic, fantastical beings (lightning bears being one of my favorites) inhabit treacherous landscapes and everything reads like a Tolkien novel.

It's all very confusing. Until you consider one thing: how else could it truly be?

It seems that when we humans have a chance to invent a different world we reach for the fantastic. One of my favorite games is Witcher 3 and it could easily take place in Bitworld.

Wait... it does take place in Bitworld, except Geralt isn't formed from the soul of a dead person. At least that we know of.

This was the slow realization as I finished Fall: Stephenson captured the only way this could possibly work out. Left to their own devices, dead souls will build and create what they find interesting or, put another way, what inspiration wells from their deep memory. This, to me, was a stroke of genius and, once again, Stephenson has spun my brain.

The Cloud Within the Silver Lining

With a book this long it's impossible to not become irritated by an author's affectations. This is especially true with audiobooks. I began to sense when Stephenson was tired of rolling out a certain plot point - fatiguing him to the point where the word "various" would crop up more than normal and he would flip into passive voice:

    Various nobility were arrayed around the table while goblets of wine were filled and bawdy jokes told.
    - Example of Tired Stephenson

You can't blame Stephenson for succumbing to fatigue with this book.
It's HUGE and it's DENSE, but sentences like this should be tackled (in my opinion) by an editor and reworked into something less throwaway.

It's Almost Too Huge

The subjects in Fall are legion. Stephenson talks about social change, the death of fact and the internet as well as its utopian blockchain-powered replacement - this could have been a satisfying book on its own! There's the question of what a soul is and the religious probings of an afterlife, each of which gets its turn, albeit a shallow one.

It's unlike Stephenson to arm wave any detail, but I felt in this book he had to in order to finish it. I don't mind that he didn't drill into everything, but some pruning could have been helpful. For instance: server farms in space sound fun, but how exactly do you network those to avoid the obvious latency issues? Cosmic rays and radiation, meteorites and natural disasters taking souls out of existence...

This might sound demanding, but Stephenson is known for rounding out these kinds of plot elements. I mean, servers in space? Yeah! I can barely manage to get an app deployed to AWS, but deploying a virtual soul? Yikes! And what about viruses?

Finally: you want to spend time with beloved characters from his old books that Stephenson reintroduces in this story. The Shaftoes make a tangential entrance and I immediately started thinking about Bobby, one of my favorite characters from Cryptonomicon. I kept thinking that they would play a larger role but nope.

Some characters do (I won't spoil who it is) and it's weird. I never fully understood why this character was there but that's OK, I loved them before and I still love them now :).

A Missed Opportunity, Maybe

I brought up Anathem before, primarily because it's one of my favorite books from Stephenson. Such a bizarre story in an alternate reality that dealt with the very idea of what reality and consciousness are.

It would fit perfectly within Fall.
In fact I was convinced that the ending would feature Bitworld giving itself the name of Arbre with the main characters founding the Concents to avoid some kind of collapse. I think that would have been fun but maybe a little over the top (as if that's a problem here).

The Audio Performance

The book is read by the same narrator as REAMDE: Malcolm Hillgartner. He is extraordinary in this book, his attempts at an Australian accent aside. I do voice over stuff for videos and I can tell you that keeping the energy and pace as he does is a miracle. I couldn't help myself in trying to figure out where the daily breaks were - a narrator's voice will typically sound crisper from one chapter to the next - but I couldn't do it with Mr. Hillgartner.

Well, that's it! I really enjoyed this book but it takes dedication. I had to make sure that I didn't listen in small bits (10 mins or less) as I'd lose the plot quickly. Instead, I made time in the evening to sit for an hour or so, and also during lunch breaks, which made all the difference.
When I started writing The Imposter's Handbook, this was the question that was in my head from the start: what the f*** is Big O and why should I care? I remember giving myself a few weeks to jump in and figure it out but, fortunately, I found that it was pretty straightforward after putting a few smaller concepts together.

UPDATE 3-26-2019: This post hit Hacker News and a few other opinion boards and it caused quite a few arguments. I responded, but I figured I'd clarify a few things quickly right here.

First: Big O is conceptual. Many people want to qualify the efficiency of an algorithm based on the number of inputs. One commenter said "if I have a list with 1 item it can't be O(n) because there's only 1 item so it's O(1)". This is an understandable approach, but Big O is a technical adjective, it's not a benchmarking system. It's simply using math to describe the efficiency of what you've created.

Second: in the way I'm using it, Big O is worst-case, always. That means that even if the thing you're looking for is the very first thing in the set, Big O doesn't care: a loop-based find is still considered O(n). That's because Big O is just a descriptive way of thinking about the code you've written, not the inputs expected.

If you disagree with me, feel free to drop me an email using rob at this domain.

I was recently at NDC London and gave a talk which had Big O in it. I asked a few of the attendees and other speakers about the subject - wondering if it would be a useful thing to talk about or if it was too academic and theoretical. The replies I got were a bit mixed, but there was, overwhelmingly, a common refrain.

Big O? I HATE Interview Questions! This One Time...

This was the primary response from asking roughly 15 people what they thought. The common sentiment was along these lines: "I get paid to write code, not white papers. Big O has nothing to do with my day job."

I promise you: I am not overstating this for effect. People don't like to be put on the spot in interviews. 
They don't want to be made to feel stupid. All of this is understandable but the unfortunate side effect is that a very useful concept (Big O) gets kicked to the curb.

I'm glad I took the time to learn Big O because I find myself thinking about it fairly often. If you've always wondered about Big O but found the descriptions a bit too academic, I've put together a bit of a Common Person's Big O Cheat Sheet, along with how you might use Big O in your everyday work.

Rather than base this on arrays and simplified nonsense, I'll share with you a situation that I was in just a month ago: choosing the right data structure in Redis. If you've never used Redis before, it's a very basic key-value store that works in-memory and can optionally persist your data to disk.

When you work in a relational database like PostgreSQL, MySQL, or SQL Server you get a single data structure: the table. Yes, there are data types, sure, but your data is stored in a row separated by columns, which is a data structure.

Redis gives you a bit more flexibility. You get to choose the data structure that fits your programming need the best. There are a bunch of them, but the ones I find myself using most often are:

String. With this structure you store a string value (which could be JSON) with a single key.
Set. A Set in Redis is a bunch of unordered, unique string values.
Sorted Set. Just like a Set, but sorted.
List. Non-unique string values sorted by order of insertion. These things operate like both stacks and queues.
Hash. A set of string values identified by "sub keys". You can think of this as a JSON object with values being only strings.

Why are we talking about Redis when this post is about Big O? Because Redis and Big O go hand in hand. To choose the right data structure for your needs, you need to dig you some Big O (whether you know it's Big O or not).

Finding Something in a Shopping Cart

Let's say you're tasked with storing Shopping Cart data in Redis. 
Your team has decided that an in-memory system would work well because it's fast and it doesn't matter if cart data is lost if the server blows up.

The question is: how do you store this information? Here's what's required:

Finding the cart quickly by key
CRUD operations on each item within the cart
Finding an item in the cart quickly
Iterating over each item in the cart

Believe it or not, you're thinking in Big O right now and you might not even know it. I used the words "quickly" and "iterate" above, which may or may not mean something to you in a technical sense. The thing I was trying to convey by using the word "quickly" is that I want to get to the cart (or an item within it) directly, without having to jump through a lot of hoops.

Even that description is really arm-wavy, isn't it? We can dispose of the arm-waving by thinking about things in terms of operations per input. How many operations does my code need to perform to get to a single cart from the set of all carts in Redis?

Only One Operation: O(1)

The cool thing about Redis is that it's a key-value store. To find something, you just need to know its key. You don't have to run a loop or do some complex find routine - it's just right there for you.

When something requires only one operation we can say that directly: my code for finding a shopping cart is on the order of 1 operation. If we want to be Big O about it, we can say it's "order 1", or "O(1)". It doesn't matter how many carts are in our Redis database either! We have a key and we can go right to it.

A more precise way to think about this is to use the term "constant time". It doesn't matter how many rows of data are in our database (or, correspondingly, how many inputs to our algorithm) - the algorithm will run in constant time, which doesn't change.

What about the items in the cart itself?

Looping Over a Set: O(n)

We know that our cart will need to store 0 to n items. 
I'm using n here because I don't know how many items that will be - it varies per customer.

I can use any of Redis's data structures for this:

I can store a JSON blob in a String, identified by a unique cart key
I can store items in a Set or Sorted Set, with each item being a bit of JSON that represents a CartItem
I can store things in a List in the same way as a Set
I can store things in a Hash, with each item having a unique sub key

When it comes to items in the cart, we need to be able to do CRUD stuff but we also need to be able to find an item in the cart "as quickly as possible". If we use a String (serializing it into JSON first), a Set or a List we'll need to loop over every item in a cart in order to find the one we're looking for.

Rather than saying "need to loop over every item", we can think about things in terms of operations again: if I use a Set or a List or a String I'll need one operation for each of the n items in my cart. We can also say that this is "order n", or just O(n).

You can spot O(n) operations easily by simply looking for loops in your code. This is my rule of thumb: "if there's a loop, it's O(n)".

Looping Within a Loop: O(n^2)

Let's say we decided to keep things simple and deal with problems as they arise, so we chose a Set, allowing us to dump unique blobs of JSON data that we can loop over if we need to. Unfortunately for us, this caused some issues:

[Figure: Duplication in our Set]

By changing the quantity in our CartItem we have made our JSON string unique, causing duplication. We need to remove these duplications now, otherwise our customers won't be happy.

Simple enough to do: we just loop over the items within a cart, and then loop over the items one more time (skipping the current loop index) to see if there's a match on sku. This is a classic brute force algorithm for deduping an array. 
That's a lot of words to describe this nested loop algorithm and we can do better if we use Big O.

Thinking in terms of operations, we have n operations per n items in our cart. That's n * n operations, which we can shorthand to "order n squared" or O(n^2). Put another way: deduping an array this way is an O(n^2) operation, which isn't terribly efficient.

As I said before, I like to think of these things in terms of loops. My rule of thumb here is that if I have to use a loop within a loop, that's O(n^2). Another rule of thumb is that the term "brute force" often denotes an O(n^2) algorithm that uses some kind of nested loop.

Indexing a Database Table and O(log n)

If you've ever worked on a larger project with a DBA, you've probably been barked at for querying a table without utilizing an index (a fuzzy search, for instance). Have you ever wondered what the deal is? I have. I was that DBA doing the barking!

Here's the thing: tables tend to grow over time. Let's say that our commerce site is selling independent digital films and our catalog is constantly growing. We might have a table called film filled with ridiculous test data that we want to query based on title. Unfortunately, we don't have an index just yet and our query is beginning to slow down. We decide to ask PostgreSQL what's going on using EXPLAIN and ANALYZE:

Our database is doing what's called a "Sequential Scan". In SQL Server land this is called a Table Scan (or full table scan) and it basically means that Postgres has to loop over every row, comparing the title to our query argument.

In other words: a Sequential Scan is a loop over every item, which means it's O(n), where n represents the number of rows in our table. As our table grows, the time this query takes grows linearly with it.

It's easy to improve the performance here by adding an index:

Now we're using an Index Scan, which is, I suppose, much faster. But how much? 
And how does it work?

Under the covers, most databases use a version of an algorithm called binary search - I made a video about this and other algorithms which you can watch right here if you want. For binary search to work properly, you have to sort the list of things you're working with. That's exactly what Postgres does when you first create the index:

Now that the index is sorted, Postgres can find the title we're looking for by systematically splitting this list in half until there's only one row left, which will be the one we want.

This is much better than looping over every row (which we know is O(n)), but how many operations do we have here? For this we can use logarithms:

We're continually splitting things in half in a sorted set until we arrive at the thing we want. We can describe this with an inverted binary tree, as you see above. We start with 8 values, split, and are left with 4, which we split again to get 2, then finally 1.

This is exponentiation run in reverse, as we're going from 2^3 (8) down to 2^2 (4) down to 2^1 (2) and finally 2^0 (1). The inverse of exponentiation is the logarithm. That means that we can now describe the operations of our database index as being "logarithmic". We should also specify "logarithmic of what?", to which we can answer "we don't know, so we'll say it's n", which gives us O(log n).

This kind of algorithm is called "divide and conquer" and when you see those words, there's a good chance you're talking about an algorithm with a log n in it.

And So What?

Here's why you care about turning something that's O(n) into O(log n), and the best part is that it's not really arguable because it's math (I was told that means you're always right :trollface:).

Let's say we have 1000 records in our film table. To find "Academy Dinosaur" our database will need to do up to 1000 operations (comparing the title in each row). But how many will it do if we use an index? Let's use a calculator and find out, shall we? 
I need to find the log base 2 (because of the binary split) of 1000:

[Figure: 10 operations with our index]

Only Ten! Ten splits of 1000 records to find what we want in our database. That's a performance gain of two orders of magnitude, and it's a lot more convincing to tell someone that as opposed to "it's a lot faster".

The best part here is that we can keep using this calculator to find out how many operations will be needed if we have a million records (it's 20) or a billion (it's 30). That kind of scaling as our inputs go up is the stuff of DBA dreams.

Bonus Question: What's The Big O of a Primary Key Lookup?

It's tempting to think that if I have a primary key and I know the value of that key that I should be able to simply go right to it. Is that what happens? Think about it for a second and while you're thinking let's talk about Redis a bit more.

A major selling point of Redis (or any key-value system really) is that you can do a lot of stuff with O(1) time complexity. That's what we're measuring when we talk about Big O: time complexity, or how long something takes given the inputs to the algorithm you're working with. There's also space complexity, which has to do with the resources your algorithm needs, but I'll save that for another post.

Redis is a key-value store, which means that if you have a key, you have an O(1) operation. For our Shopping Cart above, if I use a Hash I'll have a key for the cart as well as a key for each item in the cart, which is huge in terms of performance, or I should say "optimal time complexity". We can access any item in a cart without a single loop, which makes things fast. Super fast.

OK, back to the question regarding primary key queries: are they O(1)? Nope:

[Figure: This surprised me!]

Indexes in most database systems tend to use a variation of binary search, and primary key indexes are no different. 
That said, there are plenty of optimizations that databases use under the covers to make these queries extremely fast.

I should also note that some databases, like Postgres, offer you different types of indexes. For instance you can use a Hash Index with Postgres that will give you O(1) access to a given record. There is a lot going on behind the scenes, however, to the point where there's a pretty good debate about whether they're actually faster. I'll sidestep this discussion and you can go read more for yourself.

There You Have It

I find myself thinking about things in terms of Big O a lot. The cart example, above, happened to me just over a month ago and I needed to make sure that I was flexing the power of Redis as much as possible.

I don't want to turn this into a Redis commercial, but I will say that it (and systems like it) have a lot to offer when you start thinking about things in terms of time complexity, which you should! It's not premature optimization to think about Big O upfront, it's programming, and I don't mean to sound snotty about that! If you can clip an O(n) operation down to O(log n) then you should, don't you think?

So, one last time:

Plucking an item from a list using an index or a key: O(1)
Looping over a set of n items: O(n)
A nested loop over n items: O(n^2)
A divide and conquer algorithm: O(log n)

Hope this helps!
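The four cases in the recap can be sketched in a few lines of Python. This is only an illustration, not the code from the Redis project described in the post: the cart data, the sku and quantity fields, and every function name are assumptions invented for the example. Each function also returns an operation count so the growth is visible rather than arm-waved.

```python
import math


def find_by_key(carts, key):
    """O(1): a dict lookup goes straight to the value, much like a key in Redis."""
    return carts.get(key)


def linear_find(items, sku):
    """O(n): loop over the items until the sku matches.

    Returns (item, comparisons); item is None when nothing matches.
    """
    comparisons = 0
    for item in items:
        comparisons += 1
        if item["sku"] == sku:
            return item, comparisons
    return None, comparisons


def brute_force_dedupe(items):
    """O(n^2): for each item, loop over the items again looking for the same sku.

    Keeps the first occurrence of each sku. Returns (unique_items, comparisons).
    """
    comparisons = 0
    unique = []
    for i, item in enumerate(items):
        seen_earlier = False
        for j, other in enumerate(items):
            if j == i:
                continue  # skip the current loop index
            comparisons += 1
            if j < i and other["sku"] == item["sku"]:
                seen_earlier = True
        if not seen_earlier:
            unique.append(item)
    return unique, comparisons


def binary_search(sorted_values, target):
    """O(log n): halve a sorted list until the target is found.

    Returns (index, steps); index is None when the target is absent.
    """
    low, high = 0, len(sorted_values) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_values[mid] == target:
            return mid, steps
        if sorted_values[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return None, steps


if __name__ == "__main__":
    cart = [
        {"sku": "film-1", "quantity": 1},
        {"sku": "film-2", "quantity": 1},
        {"sku": "film-1", "quantity": 2},  # same sku, different quantity
    ]
    print(linear_find(cart, "film-2"))
    print(brute_force_dedupe(cart))

    # Zero-padded titles sort the same way as their numbers.
    titles = [f"title-{i:04d}" for i in range(1000)]
    print(binary_search(titles, "title-0999"))

    # Rounded-up log base 2: roughly how many halvings each table size needs.
    for n in (1_000, 1_000_000, 1_000_000_000):
        print(n, math.ceil(math.log2(n)))
```

Returning a count alongside the result is a deliberate choice for the sketch: it lets you grow the input and watch the counts grow linearly, quadratically, or barely at all, without reaching for a benchmark.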
I started my career on the Microsoft stack building forms and websites using drag and drop tools. Over time that became a punchline, which is unfortunate because honestly, the productivity was insane.

In 2008 I made the jump to the Linux world and I was completely disoriented. Everything was a damn text file. Yes, you could use a Mac or Ubuntu or whatever Unix Desktop du Jour seemed fun but there simply was no getting around the need to know your commands, which I did.

Just like learning SQL, learning your text commands makes you more efficient. I promise you that I'm not about to flip the l33t bit. I'm not here to convince anyone of anything - what I do want to do is to share how I embraced the command line with respect to PostgreSQL and was damn happy for it.

Friendly vs. Friendly

I've been meaning to write this post for years but it was this post from Ryan Booz that made me fire up the editor. Ryan is writing a series on how he's learning PostgreSQL after a 15 year (!) career as a SQL Server DBA. I can't imagine that change is an easy one.

Basically, Ryan has concerns (which I understand):

"In the case of PostgreSQL, I've quickly come to the conclusion that bad tooling is one of the main reasons the uptake is so much more difficult and convoluted coming from the SQL Server community. Even the devs I'm currently working with that have no specific affinity for databases at all recognize that PostgreSQL just feels like more of a black box than the limited experience they had previously with SQL Server."

I can't say he's wrong on this, although I will say the term "bad" is a bit subjective.

Let me get right to it: jumping from SQL Server to PostgreSQL is much more than changing a tool. PostgreSQL was built on Unix, with Unix in mind as the platform of choice, and typically runs best when it's sitting on some type of Unix box. 
The Unix world has a pretty specific idiom for how to go about things and it certainly isn't visual!

As someone who learned to code visually, I had to learn what each icon meant and the visual cues for what happens where. I came to understand property panes, the lines under the text of a button that described shortcuts, and the idiomatic layout of each form. Executing a command meant pressing a button.

In the Unix world you write out that command. The check boxes and dialogs are replaced by option flags and arguments. You install the tools you need and then look for the binaries that help you do a thing, then you interrogate them for help, typically using a --help command (or just plain help).

The same is true for PostgreSQL. This is the thing that I think was stumping Ryan. He's searching for visual tooling in a world that embraces a completely different idiom. It's like going to Paris and disliking it (and France) because the barbecue is horrible.

Let's walk through some common PostgreSQL DBA stuff to show what I mean.

Your Best Friend: psql

When you encounter a new Unix tool for the first time (and yes, I'm labeling PostgreSQL that) you figure out the binaries for that tool. PostgreSQL has a number of them that you'll want to get to know, including pg_dump and pg_restore among others. The one we want right now is psql, the interactive terminal for PostgreSQL that gets installed along with the server. Let's open it and ask it what the hell is going on:

[Figure: Hello psql]

I'm using Mac's Terminal app but you can use any shell you like, including PowerShell and the Windows command line. I would strongly urge you, however, to crack open a Linux VM or Docker to get the flavor of working with PostgreSQL. You can, indeed, find barbecue in Paris but it might help to explore the local cuisine.

Reading through this list of options and commands will take some patience the first time but it's worth it! 
At the top of the list are the common options, like using -c for running a command and -d for the database to run the command in. There's a key statement, however, at the very end of this help screen:

[Figure: psql: it's interactive!]

The psql tool is interactive! This will help us - so let's log in to a database and have a look around. But which database? We'll create one by running this on the command line:

createdb redfour

The createdb binary has one job, in typical Unix fashion: create a database on the local server. It has a counterpart binary as well: dropdb. How do I know this? It's one of those things you get used to as you work with Unix systems - figure out the binaries and what they do.

How do you do that? We know about one binary so far, psql, so let's figure out where that lives and hopefully the other binaries live there too:

[Figure: Using which and ls to tell us more]

This is one of those things you learn over time: asking Unix which version of a tool/binary/runtime it's using and where it's located. The result of that command is telling me that psql lives in the /Applications/.../bin directory, which is pretty standard for binary tools. I copy/paste the result to the ls command (list contents) and we can see the binary tools at our disposal.

Yay. Let's log in and play around.

What the Hell Is Happening?

Right now I'm at an interactive terminal within my database and have no idea what to do next. This is the major upside of visual tooling: you have cues that you can follow which inform you as to what's happening. It's the difference between Halo on the Xbox and an old school MUD - it feels outdated and silly.

Let's keep going and see if that's true. When we ran the --help command before, it told us to use \? to figure out commands within psql, so let's try that first:

[Figure: Hello sea of text crashing over me!]

There is so much to absorb here. All of these cryptic little commands do something but what they do, at first, will likely be opaque to you. 
This is Yet Another Patient Deep Breath point, because pretty soon we're going to light this shit on fire (in a good way). What you have, right here, is a lot of raw power right at your fingertips. It just takes a few hours to understand the cadence of these commands as well as their modifiers. I'll show you exactly what I mean in just a few minutes; for now let's ground ourselves.

Scroll down (using the down arrow or your mouse) to the Informational command section. This is your bread and butter - here you can see what's in your database at a quick glance. We can do that by using \d (press q to get out of the text view of the help page):

Our database is empty. Let's fix that by creating a quick table for our users:

When you write a SQL command within psql you can wrap the lines. Notice that the prompt changes as well, telling you that you currently have an open paren. To finish the command add a semi-colon and we're done. Note: I'm not going to get into SQL but it's really worth your time to learn.

Now let's list out our relations again:

Lovely. We have our table and the thing that handles the id generation for that table, called a sequence. Let's ask psql more about this table using \d users:

The structure of our table is laid out in glorious ASCII, heavy with information and completely bereft of anything resembling prettiness. For visual people, this is a turn off as it's completely different than what they're used to (which I understand). For people used to working in a text-based idiom, this is heavenly.

Why? It's the speed of the thing. Let's put a clock to the problem. One of your appdevs just did something ill-advised with their ORM and they think they might have broken the users table. You decide to investigate:

psql redfour
\dt users

When you're just starting out with PostgreSQL (and psql), you'll need to squeeze your memory a bit for the commands to inspect a table. 
After a while your fingers will be done typing before your next breath.

This is the power you want as a DBA.

At this point I could go off on all of the psql commands available to you, however I would encourage you to explore these for yourself and see what's possible, and how blazingly fast you can get it done. My coworker (ha ha so fun to say that now) Craig Kerstiens has written extensively on PostgreSQL, and this post is extremely helpful for people getting used to the command line aspect of it.

I want to get into why this kind of thing matters.

Text is a Helluva Drug

If I asked you to move data from one server to another using your favorite visual tool, how would you do it? If you do it often then the process would be a simple one and likely involve some right-clicking, traversing a menu, and kicking off a process in your tool of choice.

In Unix land (and therefore Postgres land) it's a matter of remembering a few commands. But this is where it gets interesting because everything in Unix is a text file. Almost every task you can think of in Unix can be done using a text-based command. It would be like trying to find barbecue in Paris when every building is made of meat and the Seine is a river of hot coals.

To show you what I mean, here's how you might pull your production database down to your local server:

pg_dump postgres://user:password@server/db > redfour.sql
createdb redfour
psql redfour < redfour.sql

The pg_dump binary has the singular task of turning a database into a SQL file. You can, of course, tweak and configure how this works and to find out all of the options you would use... can you guess? pg_dump --help. So we're dumping the structure and data to a SQL file, creating our database and then pushing that SQL file as a command into our new database.

This entire process will execute in under 5 seconds on a smaller sized database (~20 MB). 
This is why we like text and text-based interfaces - SPEED!

There's Always a Way

As you might be able to tell, I've had this conversation more than a few times. Visuals are very important, to be sure! But they have their place when it comes to your daily workflow as a DBA. I would argue that double-clicking, right-clicking, and drag/drop are much slower than taking the time to memorize some common commands.

One place where psql sucks, however, is visuals. Executing a query on a large table can look horrible:

[Figure: Yuckity Yuck]

This is the DVD Rental sample database, running a select * from film; query. It looks like crap! The good news is that we should be able to fix this. Let's ask psql what's going on using \?:

There are two things to notice here. The first is \x, which allows for expanded output, or vertical alignment of data. That looks like this:

[Figure: Using expanded output]

The other thing you can do is to set HTML as the output using \H. This will execute the query, returning the results in HTML:

This is interesting but I want this saved to a file. To do that, I can use \o (which you can see in the help menu) and specify which file:

The file produced isn't terribly exciting, but it's somewhat useful:

This is where we can embrace the texty nature of Unix and see what's possible if we start jamming binaries together with some core Unix commands, which are all based on text.

Let's use psql to execute a query, but this time we'll format things using Bootstrap:

echo "<link rel='stylesheet' href='https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css'>" > report.html && psql chinook -c "select * from film" -H >> report.html && open report.html

OK, it's certainly not crazy amazing but for a quick report it's not so bad. You can alter the SQL statement to output only the columns you want, and you could formalize the call using a bash function to make it all pretty.

Yeah But It's Not Management Studio!

Very true. 
You can't double-click a table and edit the rows, for instance, and there are no spiffy icons. Altering data is done with INSERT and UPDATE commands, deleting is done with DELETE. This is something that you do have to get used to, for sure, and if this is a common task for you then you might want to focus on a tool that allows that (such as Postico, which is free).

If there's one reason to use psql it's speed. I would also argue power but for now I'll go with speed as the primary reason. You're done before you know what happened and, if you have repetitive tasks, you can save your commands as text files to run as you need, when you need.

Change isn't easy, but the people I know that have made the change use psql on a daily basis and absolutely love it. They also flip into a visual tool as needed. One thing they all agree on, however, is that they don't miss the visual stuff at all.
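The Bootstrap report one-liner from earlier can also be wrapped in a small script so it can be re-run on demand. The sketch below uses Python rather than a bash function purely for illustration; the function names, the BOOTSTRAP_LINK constant, and the default file name are assumptions made up for the example, and it relies only on the psql flags already shown in this post (-c and -H). Actually running write_report needs psql on your PATH and a database you can connect to.

```python
import subprocess

# Same stylesheet link used in the one-liner above.
BOOTSTRAP_LINK = (
    "<link rel='stylesheet' "
    "href='https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css'>"
)


def build_psql_command(database, query):
    """Return the argument list for running one query with HTML output (-H)."""
    return ["psql", database, "-c", query, "-H"]


def write_report(database, query, path="report.html"):
    """Run the query through psql and save a Bootstrap-styled HTML report."""
    result = subprocess.run(
        build_psql_command(database, query),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if psql exits with an error
    )
    with open(path, "w", encoding="utf-8") as report:
        report.write(BOOTSTRAP_LINK + "\n")
        report.write(result.stdout)
    return path


if __name__ == "__main__":
    # Only prints the command; call write_report(...) to actually run it.
    print(build_psql_command("chinook", "select * from film"))
```

Keeping the command construction separate from the call that runs it means you can check what would be executed before pointing it at a real database.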
I'm here in London at NDC on a lovely Saturday morning and I have the day to myself. I just made myself some coffee and I was recalling a conversation I had last night with some friends that are here about traveling internationally, what to pack, and the little tricks we use.

I traveled for a year with my family (my wife, myself, and my two daughters aged 10 and 13) and we focused on being as minimal as possible. I learned some very interesting ways to be self-sufficient, which I'll share below in no particular order.

Dealing With Jet Lag

I'm lucky enough to speak (from time to time) at international conferences, and without fail the first topic that comes up with fellow travelers is sleep, or the lack of it. Everyone has their way of dealing with it - in fact I know a few people who don't get jet lag at all! - but I haven't found anything that works as well as the biohack I used last year: melatonin.

[Figure: Melatonin, my savior]

Melatonin is an over-the-counter, natural thing that you can find in most health stores or online. It's all natural and your body produces it in order to regulate your sleep cycle.

When I came back from NDC London last year, I lived in Seattle and we hit a stretch of rain that lasted for a good 3 weeks. It's really dark this time of year in Seattle so my body simply would not adjust to the time change - there was no sun. I tried everything, and finally my wife bought me some melatonin (which she had been suggesting I try forever):

"Melatonin is a hormone that regulates sleep-wake cycles. This hormone is primarily produced by the pineal gland. As a medication, it is used for the short-term treatment of trouble sleeping such as from jet lag or shift work." (From Wikipedia)

I wish I had listened to her sooner, this stuff is miraculous. I took it and that night slept like the big fat baby I am.

I bought the spray kind that you see above at Whole Foods 3 days before I left, wondering if it would work with the 10 hour time difference I was about to face. 
We have since moved back to Honolulu and I'd grown used to sunlight and the longer days, so I was kind of dreading the jet lag.

I took some on the plane and slept for a solid 6 hours, which I never do on planes! I have been taking it every night since and (I still can't believe this) didn't feel any jet lag. Zero. None. CRAZY!

I'm sure this stuff affects people differently, but for me it was a wonder drug.

Fewer Clothes, More Laundry

I haven't checked luggage in years, reducing everything I need to smaller carry-on-sized bags. Recently I managed to travel for 2 weeks with nothing but a simple backpack, which for some people is exceedingly easy, but for me it's kind of hard. My problem is a simple one: I really like shoes.

I'll talk about shoes later, but I thought I would share with you how I cut my luggage size in half and used only a small backpack.

First thing: get a good bag. Spend some money on this and don't be cheap about it.

[Figure: The Tom Bihn Synapse 25]

This is a Tom Bihn bag and I'm a freak for these things. I hate to admit it, but I have five bags made by this company: two laptop bags, one backpack (this one above), one carry-on size (the Aeronaut) and the middle-of-the-road sized Tri-Star. A good bag makes all the difference when you're traveling for a longer time.

To make this work, you need to convince yourself that you need half as much as you think you do. This is the hardest part and I wish I had a trick for you but I don't. I deal with this every time I go anywhere.

What has worked for me is the following:

4 pairs of underwear
3 pairs of socks
3 shirts
2 undershirts (Under Armour or the like)
2 - 3 pants (no jeans!)
1 pair of shoes
Trial-sized or no toiletries
1 small bottle of Dr. Bronner's soap
Laptop, charger, optional small iPad
Drying line and camping water heater

All of this will fit easily in the Synapse, the trick is making it fit into your mind! 
I can't really help with that part other than to say it's really nice to pack things up in 10 minutes and not carry around a giant bag!

The key here is the undershirts. I like Under Armour because they're not cotton (have I said this already?), feel great, and I wash them instead of my shirts. I'm being purposely vague about what these shirts are because you pack what you need for your occasion. Dress shirts for work stuff, more casual if you need.

To make this all work properly, we need to break down this list a bit. Let's start with the underwear.

Let's Talk About Undies

This tip works for whatever body you inhabit: good underwear and socks will make the trip exponentially better. I have a simple rule for traveling and do my best to stick to it: cotton is your enemy. I'm not suggesting you run out to your local mountain performance store and stock up on technical gear! I simply mean there are some alternatives you can think about.

I'm a huge fan of Lululemon underwear. The men's briefs and women's undies are amazingly comfortable and super easy to clean. I bring two kinds with me: regular briefs and performance briefs (for running or playing basketball). The performance ones are wicking and are perfect for long flights. They're also great for walking miles and miles, which I love to do in cities like London.

The best part is that they dry fast. This is why you don't want cotton! You can, literally, wash these things in the sink, roll them in a towel and sit on them for 5 minutes (rolled up) and they'll be dry enough to put on and walk out the door. No kidding! More on laundry later.

The same thing goes for socks: avoid cotton, get some performance ones. Your feet will seriously thank you.

Finally: pants. You can find stretchy cotton pants everywhere these days and they're great because they wrinkle less than full cotton. 
Blue jeans are the worst choice because they take forever to dry (which is horrible if you go to a rainy climate), don't insulate well at all, don't wick worth a damn and tend to dress you down unless they're the expensive kind. They also take up a ton of space and weigh a lot.

For this trip I brought two pairs of Vans chinos and they've been perfect.

[Image: Believe it or not, this isn't me.]

Tip: Recycling Your Clothes is OK

When we went to Europe for the year I knew we would be facing all kinds of variation in temperature and weather so I came prepared to recycle my clothes. There are a lot of second-hand stores across the world and they're easy to find - just Google for "Goodwill store near me" (or Oxfam, Salvation Army, etc).

I often find myself caught out with clothes that are either too cold or too warm. There are bargain stores everywhere or, if you like shopping like I do, just get the thing you need and recycle something you brought with you.

I try not to emotionally attach myself to my clothes. On our big trip I recycled 90% of what I brought with me.

Hotel Laundry

Most people I know have done hotel laundry: dumping some bath gel or shampoo into the hotel sink (or bath tub) and doing it all by hand. This works fine, but there's a better way! The laundry bag itself.

[Image: Me, doing laundry in a bag]

Most hotels provide a laundry service and give you bags for the clothes you want cleaned. This service is pretty expensive and you can easily spend $50 on a single load.

Some hotels offer coin-operated laundry as well, but that's a pain and it also doesn't apply to us because we're not bringing enough clothes to justify that much energy/water consumption.

A super-simple, easy way to do laundry is to fill the laundry bag half-full with water, add a few squirts of your Dr. Bronner's soap and shake. This will foam everything up and then you dump your clothes in.

You can spin everything by hand or just roll the bag around the tub - but keep it in the tub!
These bags can easily pop open!

Here's another tip: turn the bag upside down while holding it semi-closed. Water will come out and create a siphon within the bag, squeezing out a lot of the water so you don't have to.

Refill the bag 2 more times for the rinse and you're done.

Simple Drying With Hangers and a Line

Once you're done with your laundry you'll be very happy that you don't have cotton. Drying synthetic fabrics is so much faster and simpler.

For this I'll use the hangers in the closet, which usually have those clips on them, to dry things. If I have more clothes with me (and therefore more laundry) I'll bring a drying line, which you can find at any camping store. That's a bit more stuff, however, and recently I've let the thing stay at home.

Quick Dry With a Towel

As I mentioned above, if you need some undies quickly you can lay a towel on the floor, put your undies on one side, fold the towel over them and then roll it all up as tightly as you can. Sit on the rolled towel for 5 minutes and you'll be amazed at how dry everything is.

Super Quick Dry With an Iron or Blow Dryer

Be very, very careful with this one - if you bought those nice undies you can easily melt them! Make sure you set the iron appropriately.

When I was in college I had a job as a waiter at a Mexican restaurant which made us wear these ridiculous tuxedo shirts. They had to be cleaned and pressed every day, and it was a huge pain. Until I learned how to shower wash them.

I was always late, but I had enough time to take a quick shower and bring the shirt in with me. It turns out that shampoo is a pretty good laundry detergent so I would wash the thing quickly, wrap it in a towel to dry it, and then put it under an iron.

Within 5 minutes the thing was almost perfectly dry. You can do the same with a blow dryer as well, which works great for socks.

Coffee!

I like good coffee but when I'm traveling I have to let that idea go.
Hotels (especially internationally) will usually offer you instant and if you want anything better you have to go downstairs and wait at breakfast. Even then, it tends to be crap.

This triple sucks when you're jet lagged, awake with a caffeine headache at 3am and just want something right now. For this I turn to Starbucks VIA Instant, my life saver.

There are Starbucks in most airports around the world, including my home airport in Honolulu. Before I go I'll buy 2-3 boxes of the stuff (depending on how long I'm gone) and put the little packets in the side pocket of my bag.

[Image: Coffee in the morning in my room]

Most hotels will give you a carafe of some kind for tea or instant coffee if you're in Europe, but in the US this often isn't the case so I bring this thingy (a submersible water heater) with me on every trip.

These can be hard to find, but I found mine at REI here in the US. Most camping stores have them and you can Google "submersible water heater" to find one near you.

My wife laughed at me when I brought this thing on our big trip, telling me I was being a techy camper with goofy gadgets. When she was craving coffee at 4am in our hotel in Iceland that didn't have a water carafe she gave me a massive hug! That one moment (which has repeated 4 times in my life) will make this gadget a necessity for you.

Most hotels will put coffee mugs in your room, but I don't like trusting that so I bring my special cup with me, which I suppose is a bit of a convenience but this is also where my socks/underwear are packed inside my bag.

[Image: A Yeti in London]

I have yet to find a cup as versatile and useful as this here Yeti. I can put two packets of coffee in here and it stays delightfully hot on these cold London mornings. Later I can put in a nice cold beer or some water that I've lifted from the breakfast buffet.

Shoes

This is my weakness. I love a good pair of shoes and I tend to lose focus when packing. I convince myself that I'll need a dress pair (for speaking or just going out) and a pair for walking/exercising.
This is almost never the case, but I still give in anyway (as I did on this very trip).

I gave in because I knew it was going to be cold and rainy and that I'd probably soak my shoes more than once. I was right, but I could have done this better!

If you get some comfortable, non-cotton performance socks then a simple pair of trainers or running shoes will dry incredibly fast if your feet get wet. I was thinking about this last night because I went out and bought myself some trainers! I left mine at home fearing the cold and rain - turns out walking 5 miles in a pair of Vans slip-ons really, really hurts.

[Image: Some stylish women's trainers from Allbirds]

You just have to look around a bit and you should find some super comfortable performance trainers that you can use for a run or a nice dinner out.

I found some New Balance on sale for $40 - an amazing deal - and have been wearing them ever since. I soaked them last night as I was walking around but I had my non-cotton (wicking) socks on so my feet were fine. I got home, put the blow dryer on my shoes for 5 minutes and they were passably dry.

Fewer/Smaller Toiletries

Most grocery stores have a trial section where you can find shrunken versions of most toiletries. Every time I travel I grab:

- A new toothbrush (which I leave behind)
- Toothpaste and deodorant
- A single razor
- Small thing of Advil (just in case)
- Visine (lubricating)

I also bring along nail clippers because it really sucks to find out, after a 5 mile walk, that your toe nails are a bit long (sorry, I know that's gross but hey, we're adults here).

I leave most of this stuff behind, but if I have room and haven't used most of an item I'll bring it back with me.

Have Some Tips for Me?

I'm not a super travel guru, but I have done my fair share and hacked together some fun little tricks. If you think I forgot something or have a question/comment/suggestion, feel free to leave it in the comments below.

Cheers!
Over the winter holiday break (on Christmas Eve, to be precise), Scott Hanselman and I released the next volume in the Imposter's Handbook series. It took us just over 18 months to put this thing together and I couldn't be happier with it.

[Image: Shannon's Second Theorem, Illustrated via Deathstar]

From Logic to Boolean Algebra, Binary to Circuits

There was so much content I had to ax from the first volume of The Imposter's Handbook because I ran out of time and space. For instance: I knew very, very little about binary operations and even less about encryption and I desperately wanted to change that. Unfortunately, it had to wait until I had the time.

That time came over the last 18 months. Scott and I dove into things like binary addition and subtraction, logic gates, and Boolean Algebra.

[Image: Something I could never remember: XOR]

It was fun to dive into these subjects but I was not expecting what came next. Even writing this now I'm struggling to come up with a way to accurately capture the singular importance of one person's work. He's been compared to Einstein, Edison, and Newton all rolled into one.

Claude Shannon and Information Theory

Claude Shannon created Information Theory with a single paper written in 1948. In it, Shannon detailed a way to describe information digitally, that is with 1s and 0s. He detailed how this information might be transmitted in a virtually lossless way and, as if that wasn't enough, he also described just how much information that digital signal could contain.

We take this kind of thing for granted today, but keep in mind that in Shannon's day, the only way to communicate over great distances was with telegraph or telephone!

But wait, there's more!

Prior to inventing Information Theory, Shannon wrote a master's thesis that many people regard as the most important master's thesis ever written.

The Creation of Digital Circuits

In the late 1930s, Claude Shannon was working on his master's degree at MIT.
He was also working with Vannevar Bush on the Differential Analyzer, a room-sized mechanical computer that would calculate differential equations, typically for military use.

[Image: The Twin Cambridge University Differential Analyzer (Public Domain)]

This machine had to be programmed by hand, which meant breaking it down and rebuilding it, using rods, wheels, and pulleys as variables in a complex ballistic equation. Eventually, many of the mechanical bits were replaced with electrical switches that moved levers here and there, reducing the time it took to break the machine down and rebuild it.

These switches sparked something in Shannon, who recalled a class he took at the University of Michigan called Boolean Algebra. Turns out, this chap named George Boole figured out a way to apply mathematical principles to simple logical propositions. Shannon extended those ideas and figured out how the entire room-sized computer could be replaced with a series of electrical circuits.

Claude Shannon invented the digital circuit.

Encryption, Hashing and Blockchain

The story kind of wrote itself from that point on. Digital circuits led to digital computers, which led to more efficient ways of calculating things, which led to the need to transmit that information, which led to the need to keep it a secret, which led to where we are today.

Scott and I dove into all of this.

[Image: Cracking a simple asymmetric cipher]

We explore SSH keys and how RSA works. We dive into hashes, discussing the goods and bads of each, including how Rainbow Tables can be used to quickly and easily crack credit card information that hasn't been salted.

[Image: Simplified doodle of SHA256]

We eventually end up with a discussion of cryptocurrency and blockchain, detailing why some people love it and others absolutely hate it. Both sides have some pretty good points.

It was a ton of fun writing this book, made even more so because I got to do it with a friend! I really hope you enjoy it!
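To make the jump from Boolean algebra to binary addition a little more concrete, here is a minimal sketch in Python. It is not taken from the book; the function names (`half_adder`, `full_adder`, `add_bits` and the small helpers) are made up for illustration, and the bits are plain integers 0 and 1. It builds XOR out of AND, OR and NOT, then chains one-bit adders into a ripple-carry adder, which is roughly the kind of thing Shannon showed switching circuits could do.

```python
# Basic gates on single bits (0 or 1).
def NOT(a):
    return 1 - a

def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def XOR(a, b):
    # "One or the other, but not both": (a OR b) AND NOT (a AND b).
    return AND(OR(a, b), NOT(AND(a, b)))

def half_adder(a, b):
    # Adds two bits; returns (sum_bit, carry_bit).
    return XOR(a, b), AND(a, b)

def full_adder(a, b, carry_in):
    # Two half adders plus an OR handle an incoming carry.
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, OR(c1, c2)

def add_bits(x_bits, y_bits):
    # Ripple-carry addition of two equal-length bit lists,
    # least significant bit first. The result has one extra bit.
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    out.append(carry)
    return out

def to_bits(n, width):
    # Hypothetical helper: integer -> list of bits, least significant first.
    return [(n >> i) & 1 for i in range(width)]

def from_bits(bits):
    # Hypothetical helper: list of bits (least significant first) -> integer.
    return sum(bit << i for i, bit in enumerate(bits))

print(from_bits(add_bits(to_bits(5, 4), to_bits(6, 4))))  # prints 11
```

Everything above the helpers uses nothing but AND, OR and NOT, so each function could in principle be wired up as a circuit of switches.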
The modern Droplet: How to choose the right VM for business and personal use
DigitalOcean Droplets are on-demand, Linux virtual machines suitable for production business applications and personal passion projects. We've recently introduced Storage-Optimized Droplets with NVMe SSD, and have also made other adjustments to our Droplet portfolio. With these developments in mind, and with 2021 having arrived at last, we thought it would be a good time to provide up-to-date guidance regarding how to match your Droplet plan to your use case. You might also watch my talk from deploy, in which I speak to this and related topics:
How often does the Security team directly influence customer growth and user experience for the business? Unless it's for a security or privacy product or component, the answer is rarely.
Ryujinx is an Experimental Nintendo Switch Emulator written in C# for .NET Core
I love emulators. I love that they exist. I love that we have computers that are not only fast enough to translate and emulate instructions in real time for totally different computers that may not even exist any more but also for computers that are shipping today!

I love these C# based emulators:

- CoreBoy is a cross platform GameBoy Emulator written in C# that even does ASCII
- Emulating a PlayStation 1 (PSX) entirely with C# and .NET

Today I learned about Ryujinx, an experimental Nintendo Switch Emulator written in C# on .NET Core. The homepage is at https://ryujinx.org/.

Emulators are great for learning about how to write and factor great code. Some are certainly "big ball of mud" architecture, but Ryujinx is VERY nice. Ryujinx is particularly cleanly factored with individual projects and modules that really follow the single responsibility principle.

It's written in .NET 5 and you can just git clone it, and go into the Ryujinx folder and "dotnet run," or build from Visual Studio. There are also daily builds on their site.

Some of the impressive features - and again, this is written in C# on cross-platform open source .NET 5:

- The CPU emulator, ARMeilleure, emulates an ARMv8 CPU and currently has support for most 64-bit ARMv8 and some of the ARMv7 (and older) instructions, including partial 32-bit support. It translates the ARM code to a custom IR, performs a few optimizations, and turns that into x86 code.
- The GPU emulator emulates the Switch's Maxwell GPU using the OpenGL API (version 4.4 minimum) through a custom build of OpenTK.
- Xinput-compatible controllers are supported natively; other controllers can be supported with the help of Xinput wrappers such as x360ce.

Most emulators are created for educational and experimental purposes, so don't look to be using this for nefarious purposes. This is a fantastic codebase to explore and experiment with.

Using a computer is like riding in a Lyft.
Writing an Emulator is like disassembling an internal combustion engine and putting it back together differently and it still works. It won't make you a better person but it will make you appreciate your Lyft.

Sponsor: Simplify code, reduce costs, and build faster without compromising the transactionality, scale and flexibility you need. Fauna - a global serverless database for all your applications. Learn more!

© 2021 Scott Hanselman. All rights reserved.
My mom is very clever and thoughtful and when I was in my early teens and easily overwhelmed and generally freaking out or panicky she'd say, "Feel it. Be here. What is your body telling you? Freak out fully but put a time limit on it."

This idea of "time-boxed freak outs" has always stuck with me. A few times a year I get overwhelmed. I think we all do to some extent. Often I'd try to fight it: don't cry, don't get overwhelmed. But I remembered what my Mom said and I started being present in the freak out.

I'd set a timer for 10 or 15 minutes and REALLY own it. Get upset, cry, and not feel bad about it. I deserve the release, and time-boxing it allowed me to own it and accept it. I can ramp up, and then ramp down. I've found this to be far more healthy than trying to swallow feelings and hold it in. Sometimes it needs to be OK to go and cry in your car in the parking lot.

Disclaimer: I am not a doctor. I'm a random person and this is my random blog. This advice works for me and has worked for others, but know yourself and talk to a therapist if you are having uncontrollable panic attacks or feel unsafe. If this doesn't sound helpful, be present and be safe.

I tweeted about this idea and got a number of replies from people who also found this technique helpful. Here are some anonymized quotes:

"Time boxed panic, I love it. Don't skip the feelings. You can't. You just defer them, often to disastrous results. Sit with the discomfort a while. The way out is through."

and

"Great advice. For some reason, we have been taught to suppress emotions, not to let things get to us and to not panic. And unfairly, men in particular have been encouraged not to show emotion. But it is a natural human response. Give yourself permission to feel & time box it."

and

"This is a phenomenal idea. I would add, if you could add a few more minutes, take a walk away from whatever is stressing to clear your head. Sometimes being away from what is causing the stress can help as a reset."
and finally

"You don't even have to cry or freak out. Just give yourself a time box to sit, stare, and clear your mind. No phones, no distractions. We have too much swirling in our heads."

Again, as with all random internet advice, you are under no obligation to do anything you don't feel is safe for you. However, some have found this helpful. I also recorded a TikTok about it that is just 1 minute long.

I hope it helps you. Be well!
Penny Pinching in the Cloud: Azure Static Web Apps are saving me money
I've long run a few dozen websites in Azure and while I've long noticed people (frankly) wasting money by having one Azure App Service (a Web Site) per Azure App Service Plan (a VM), I tend to pack them tight.

A Basic 1 (B1) Azure App Service running Linux is around $13 a month but has nearly 2 gigs of RAM. Up that to about $26 a month and you've got 3.5 gigs of RAM and 2 Cores AND 10 gigs of storage. Use ALL that RAM. Max out that storage - use the resources you have paid for. If you hit up against a limit you can always add more and scale up. Run those boxes hot, you paid for them!

While my blog and podcast and main site run on Azure Premium SKUs (and are fast and it's worth it) I have a dozen little one pagers, brochureware sites, and toys like https://www.keysleft.com/ and I've managed them all in an App Service as well. But they are static sites. They are nothing sites...so why do I need an App Service? It's overkill.

Turns out Azure Static Web Apps are a lovely thing and they are FREE while in Preview. It's great for static sites, sites made with static site generators, or Jamstack sites with serverless functions behind them.

So I converted a bunch of my little sites to Azure Static Web Apps. Took maybe 90 minutes to do 8 of them.

Since the code for these sites was already in GitHub, it was very easy to move them. For example, the code for the KeysLeft site is at https://github.com/shanselman/keysleft and Azure Static Web Apps has a GitHub Action that easily deploys it on every commit. It's butter. It's created for you but you can see the generated GitHub Action as it lives alongside your code.

The docs are clear and it works nicely with Vue, React, Angular, or just regular HTML like my son's Hamster Blog: https://www.myhamsterblog.com/

As it's in Preview now it's free, and I'm sure it'll be super cheap when it goes live. I have no idea how much it will cost but I'll worry about that later.
For now it's allowed me to turn off an entire Azure App Service and replace it with Azure Static Web Apps.

They also support custom domains and they automatically make and assign you an SSL cert. My only complaint is that there's no easy support (today) for apex domains (so all mine have www. as CNAMEs) but you could proxy it through a free Cloudflare account if you really want.

Check it out, I suspect you have a site right now that's either generated or just static and this could save you some money.

Sponsor: Protect your apps from reverse engineering and tampering with PreEmptive, the makers of Dotfuscator. Dotfuscator has been in-the-box with Microsoft Visual Studio since 2003. Visit preemptive.com/hanselminutes for a professional-grade trial.
I spend so much time at the command line using the Windows Terminal. Then I spend a ton of time using git at the command line. But then I ALT+TAB over to GitHub and mess around in the browser.

Why have I been sleeping on the GitHub CLI? There's a command line interface for GitHub!

I installed with "winget install GitHub.cli" but you can get it from https://cli.github.com if you like. Then you run gh auth login once:

    gh auth login
    ? What account do you want to log into? GitHub.com
    ? What is your preferred protocol for Git operations? HTTPS
    ? Authenticate Git with your GitHub credentials? Yes
    ? How would you like to authenticate GitHub CLI? Login with a web browser

Now you've got a new command "gh" to play with!

I went over to one of my local git clones for the Hanselminutes Podcast website and I can now list the open Pull Requests from the command line!

Here's the real time saver that Dan Wahlin reminded me about: gh repo create!

    > git init
    Initialized empty Git repository in D:/github/ghcliblogpost/.git/
    > gh repo create
    ? Repository name ghcliblogpost
    ? Repository description This is a test for my GH CLI Blog post
    ? Visibility Public
    ? This will add an "origin" git remote to your local repository. Continue? Yes
    Created repository shanselman/ghcliblogpost on GitHub
    Added remote https://github.com/shanselman/ghcliblogpost.git

Fantastic! You can even gh issue create!

    gh issue create
    Creating issue in shanselman/hanselminutes-core
    ? Title This is a test issue
    ? Body <Received>
    ? What's next? Submit
    https://github.com/shanselman/hanselminutes-core/issues/219

Checking out a Pull Request is a great time saver as well. Go check out http://cli.github.com/ and see how it can help you today!
Crowbits are Electronic Programmable LEGO Compatible Blocks for STEM Education
Late last year I blogged about the Elecrow CrowPi2 Raspberry Pi Laptop. The folks at Elecrow are great and I've used their original CrowPi many times with the kids and at talks. None of these links are affiliate links and I am getting no kickbacks from the company - I'm just a fan and own two of their products.

As such I was excited to see their new Kickstarter called CrowBits. These are magnetic, programmable, electronic blocks that are also LEGO element compatible, which as you likely know, is a huge plus for my family.

I've blogged a lot about STEM toys before, usually at Christmas, but this is a lovely spring surprise!

The devices are ESP32, Arduino and Micro:bit compatible, and there are over 80 of them. 30 of them need no programming.

The whole system has Scratch 3.0 software sitting on top, so my kids and I are already familiar with how to program these. If you're not familiar, MIT's Scratch is a visual block language that abstracts away the text aspects of programming in favor of visually nested blocks. It's very intuitive.

Since the people at Elecrow have successfully delivered on all their previous Kickstarters and I'm personally holding both CrowPis from those Kickstarters, I have high confidence in their ability to deliver the CrowBits.

Even better, I'm seeing in the comments on the Kickstarter that the company is aiming to allow their programming system to run on the Raspberry Pi CrowPi devices I already own, so that's a bonus that it'll all work together.

Go check it out: https://www.kickstarter.com/projects/elecrow/crowbits-electronic-blocks-for-stem-education-at-any-level

Sponsor: The No. 1 reason developers choose Couchbase? You can use your existing SQL++ skills to easily query and access JSON. That's more power and flexibility with less training. Learn more.
According to the Dapr open source website:

"Dapr helps developers build event-driven, resilient distributed applications. Whether on-premises, in the cloud, or on an edge device, Dapr helps you tackle the challenges that come with building microservices and keeps your code platform agnostic."

I've had Mark Russinovich on my podcast recently to talk about Dapr which is now at version 1.0.

Dapr is language agnostic: you can use Dapr with your language of choice by leveraging an SDK or making simple HTTP or gRPC calls. Dapr is also platform agnostic and can run on any hosting environment including local development machines, Kubernetes, and public clouds such as AWS, Azure and GCP. The Dapr sidecar container collects traces so your application is instrumented with no additional code.

Since a lot of folks who read my blog use .NET, I wanted to let you know there's a free eBook on how to use Dapr with .NET available now.

You can download the free eBook "Dapr for .NET Developers" here now! It's available as a PDF and it's being actively improved so you can offer feedback to the authors directly via GitHub issues. Congrats to Rob Vetter, Sander Molenkamp, and Edwin van Wijk for their hard work on this book!

This free book covers common needs for complex cloud apps and how to make it happen with Dapr and .NET, including:

- State management
- Service invocation
- Pub/sub
- Bindings
- Observability
- Secrets
- Dapr .NET SDK

and more.

Dapr enables developers using any language or framework to easily write microservices. It addresses many of the challenges that come along with distributed applications, such as:

- How can distributed services discover each other and communicate synchronously?
- How can they implement asynchronous messaging?
- How can they maintain contextual information across a transaction?
- How can they become resilient to failure?
- How can they scale to meet fluctuating demand?
- How are they monitored and observed?
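To give a feel for what "simple HTTP calls" to the sidecar can look like, here is a rough Python sketch of saving and reading state through Dapr's HTTP API. This is not from the eBook. It assumes a sidecar listening on the default HTTP port 3500 and a state store component named "statestore"; both depend on how you run Dapr. The helper names (`state_url`, `build_save_request`, `save_state`, `get_state`) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Dapr sidecar's HTTP endpoint; 3500 is the usual default.
DAPR_BASE_URL = "http://localhost:3500"

def state_url(store, key=None):
    # State API route: /v1.0/state/<store name>[/<key>]
    url = f"{DAPR_BASE_URL}/v1.0/state/{urllib.parse.quote(store, safe='')}"
    if key is not None:
        url += "/" + urllib.parse.quote(key, safe="")
    return url

def build_save_request(store, key, value):
    # Saving state is a POST whose body is a JSON array of key/value pairs.
    body = json.dumps([{"key": key, "value": value}]).encode("utf-8")
    return urllib.request.Request(
        state_url(store),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def save_state(store, key, value):
    # Needs a running sidecar; returns the HTTP status code.
    with urllib.request.urlopen(build_save_request(store, key, value)) as resp:
        return resp.status

def get_state(store, key):
    # Needs a running sidecar; returns the stored value, or None if empty.
    with urllib.request.urlopen(state_url(store, key)) as resp:
        raw = resp.read()
        return json.loads(raw) if raw else None

# Example (only works with a sidecar running, e.g. started via `dapr run`):
#   save_state("statestore", "order-1", {"item": "book", "qty": 2})
#   print(get_state("statestore", "order-1"))
```

The application only ever talks to localhost here; which database actually sits behind "statestore" is a matter of Dapr component configuration, which is the point of the sidecar model.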
There's also a project at the dotnet-architecture GitHub that includes a complete sample app (go give them a GitHub star, please, for their hard work!) that takes the eShopOnContainers project and instruments it with Dapr!

eShopOnDapr runs in containers and requires Docker to run. There are various ways to start the application:

- Run eShopOnDapr from the CLI
- Run eShopOnDapr from Visual Studio (best F5 debugging experience)
- Run eShopOnDapr from Visual Studio Code (allows you to debug individual containers)
- Run eShopOnDapr in Kubernetes

Hope you enjoy it! The team worked really hard on making it happen.

Sponsor: Have what it takes to code securely? Select from dozens of complimentary interactive secure coding labs, test your skills, and earn a badge. Learn more!
DotNet Boxed includes prescriptive templates for .NET Core
This is pretty cool. As you may know, when you type "dotnet new" from the command line, or if you turn on the Visual Studio preview feature "Show all .NET Core templates in the New Project Dialog", you see a bunch of potential starter templates.

These are basic jumping off points for your next new project. Some folks feel there should be more included "out of the box." Enter "dotnet-boxed" templates! You can install them from the command line easily like this:

    dotnet new --install Boxed.Templates

You can confirm they are there by running dotnet new at the command line. The new "boxed" templates have a different tag:

    Templates                     Short Name   Tags
    ---------------------------------------------------------------------
    ASP.NET Core API Boxed        api          .NET Boxed/Cloud/Service/Web
    ASP.NET Core GraphQL Boxed    graphql      .NET Boxed/Cloud/Service/Web
    ASP.NET Core Orleans Boxed    orleans      .NET Boxed/Cloud/Service/Web
    NuGet Package Boxed           nuget        .NET Boxed/Library

Let's try them out! I can see them in the File | New Project dialog in VS2019.

There's a really nice project that sets up a NuGet package right from File | New! It can even set up Test, GitHub Actions, .editorconfig, license, cake build, code of conduct, and more. All that boring boilerplate is done for you!

This is just one template example; there are also ones for WebAPIs, GraphQL projects, and Microsoft Orleans projects.

DotNet-boxed is a great community supported project! Head over to GitHub now and give them a STAR and get involved! Even better, I see some "help wanted" issues on their GitHub. I'm sure they'd appreciate your help. https://github.com/Dotnet-Boxed/Templates

Sponsor: Tired of not finding the code you're looking for? Get Sourcegraph universal code search and search code across ALL your repos, languages, and code hosts. Find and fix code fast with Sourcegraph. Try it now!
Tiny top-level programs with C# 9 and SmallSharp and Visual Studio
One of the things I'm always working on and am always excited about is making C# simpler for new folks. With .NET 5, today, this works, as it includes C# 9:

    > dotnet new console
    The template "Console Application" was created successfully
    > echo System.Console.WriteLine("Hello World"); > program.cs
    > dotnet run
    Hello World

That's C# 9 top level programs. We should be able to remove even more. Skip the first command completely and do it all on one line, one file.

A hello world is either

    using System;
    Console.WriteLine("Hello World!");

or

    System.Console.WriteLine("Hello World!");

or, scandalously

    using System;
    using static System.Console;
    WriteLine("Hello World!");

Not sure how I feel about that last one. Regardless...would this work in Visual Studio 2019? What if I was teaching a class and wanted to have one file per homework assignment, for example? Right now Visual Studio only supports one top-level program per project. Makes sense why, but for learning, why not allow folks to choose from the run/debug menu?

I'm going to add a reference to SmallSharp like this (or in Visual Studio):

    > dotnet add package smallsharp

Now here's what my homework looks like in Visual Studio 2019! There's one menu item per top level program!

This lovely prototype was done by Daniel Cazzulino (kzu) and you can learn more at https://github.com/devlooped/SmallSharp, or just try it out as I have here!

What do you think? How can small top-level programs help new people?

What about this?

    > dotnet new microservice
    > dotnet run

Sound off in the comments. How tight and simple would that be?
How To Build a Corporate Website That Can Help Your Career as a Developer
Today's corporate websites have come a long way. Before, they used to be more like a business brochure with information limited to the company name, profile, and short descriptions of the products or services offered. That has changed drastically as years passed and the online space grew more and more competitive, so much so that business […]The post How To Build a Corporate Website That Can Help Your Career as a Developer appeared first on Simple Programmer.
Do you have the passion and urge to become a successful programmer? Most people would say yes but hold a tinge of doubt, as they are glued to certain stereotypes surrounding the world of programming. This doubt is primarily because of how programmers are portrayed in society. Unlike other professions, programmers have been portrayed in […]The post 10 Things You Don’t Need To Become a Programmer appeared first on Simple Programmer.
7 Soft Skills You Need for 2021 and How To Develop Them
You made it to 2021, congratulations. The carols and Christmas trees are well gone, and you re now face-to-face with your New Year s resolutions (because January didn t count) and the pandemic that s still out there. Perhaps one of the things you resolved to improve this year is your soft skills. It s February, how well are you […]The post 7 Soft Skills You Need for 2021 and How To Develop Them appeared first on Simple Programmer.
Can you imagine your life without a smartphone? We don’t know if that is a good thing or a bad thing, but our life depends on that little gadget. We can pay the bills with a smartphone, hold meetings, and see friends and family members who are a thousand miles away. For better or worse, […]The post How Much Does It Cost To Build an App in 2021 appeared first on Simple Programmer.
I know a lot of you guys are starting out in your software developer programming careers, and you want to get up there; you want to get up to the upper echelon. So today, I thought I would record a video on how you can become a good programmer, a good one. Transcript Of The […]The post How to Become a GOOD Programmer appeared first on Simple Programmer.
What Is Business Process Modeling Notation (BPMN) and How It Can Benefit Your Next Programming Project?
As you might know, we can use modeling languages like UML (Unified Modeling Language) to specify complex software systems. However, there’s a chance that you are still unaware of Business Process Model and Notation (BPMN). BPMN is a standard processing model that offers a graphical representation of business or software requirements in a Business Process […]The post What Is Business Process Modeling Notation (BPMN) and How It Can Benefit Your Next Programming Project? appeared first on Simple Programmer.
How To Keep Your Remote Team Safe From Cyber Attacks
With the world struggling to get back on track amid the COVID-19 emergency, telework has become the silver bullet that keeps businesses up and running. Numerous software engineering teams rushed headlong into adopting the remote workplace model as well, only to hit a bunch of roadblocks along the way. These hurdles run the gamut from […]The post How To Keep Your Remote Team Safe From Cyber Attacks appeared first on Simple Programmer.
6 Reasons Why It’s Worth Becoming a Software Developer in 2021
Are you someone who loves tech and wants to make a living from it? Do you want a job that pays well and offers tons of opportunity for growth? If so, then you should consider becoming a software developer. Many people want to know if it’s still worth learning how to code in 2021. There […]The post 6 Reasons Why It’s Worth Becoming a Software Developer in 2021 appeared first on Simple Programmer.
At first glance, programming seems to be all about technical ability. That’s why many young developers underestimate the importance of soft skills and focus on developing their technical skills only. But ignoring soft skill development may lead to problems like a lack of effective communication with team members or clients, which can impact company operations […]The post A Beginner’s Guide to Soft Skills for Programmers appeared first on Simple Programmer.
IBM spent $34 billion to acquire Red Hat, a renowned enterprise that builds open-source technologies. Another IT giant, Microsoft, bought the code-repository GitHub for around $8 billion in stock. These acquisitions speak volumes for how the concept of open-source software has been embraced by the software development industry with open arms. Open-source software (OSS) has […]The post The Pros and Cons of Open-Source Software appeared first on Simple Programmer.
A quick look at history will show us how radical changes were often preceded by challenges. Throughout time, we can see that reforms in conventional systems were mostly tied to the occurrence of momentous events. In the case of remote work, the catalyst was COVID-19. Though work from home was not an alien concept in […]The post How the World of Remote Work Will Change in 2021 appeared first on Simple Programmer.
How To Improve Your Remote Software Engineers’ Productivity
Imagine the whole world as a pool of talent you could choose from to build your software engineering team. Sounds amazing, doesn’t it? But you don’t have to just imagine it: this is the reality for thousands of companies of all industries and sizes who employ remote staff. Gone are the days when you had to […]The post How To Improve Your Remote Software Engineers’ Productivity appeared first on Simple Programmer.
User Onboarding: A Simple Guide To Craft a User Onboarding Flow
A few years ago, Chanty’s team faced the following situation: We’d done a good job with our product, an alternative to Slack and easy-to-use team chat, and we expected thousands of active users. Then we opened our analytics tool and discovered that we were failing to convert users the way we expected: they were leaving. Months […]The post User Onboarding: A Simple Guide To Craft a User Onboarding Flow appeared first on Simple Programmer.
DO NOT Enroll in a CODING BOOTCAMP Until You Hear THIS
Today I thought I would talk about how to succeed in a programming boot camp. I've heard some good things about boot camps. But can you really learn to code in three months? Can you really get a job without a degree, without a resume? Without that experience? Is someone really going to hire […]The post DO NOT Enroll in a CODING BOOTCAMP Until You Hear THIS appeared first on Simple Programmer.
Top 7 Tools To Improve Your Web Development Workflow
Workflows are beneficial for every organization. They enable you to lay out all the activities that need to be done and ensure the effective and efficient implementation of your development plans. Workflows allow you to see the overall movement of your project and organization and help you retrace your steps, identify risks, and optimize your […]The post Top 7 Tools To Improve Your Web Development Workflow appeared first on Simple Programmer.
The resume is the document that gets you in the door at a new company. I can’t stress how important it is to invest time and energy into this process, since your resume is the first thing that introduces you to potential employers. In this case, it should include enough useful information so that everyone […]The post 5 Practical Tips For Your Tech Resume appeared first on Simple Programmer.
A lot of you probably aspire to write a book, and you probably have heard a lot of different things about writing a book, whether it's profitable or not. Should you do self-publishing? Should you publish through a publisher? How do you even go about it? I'm gonna try to answer a bunch of those […]The post I Made Over $250,000 Selling My Programming Book appeared first on Simple Programmer.
How to Restart Your Developer Career After a Long Break
Software development has been one of the prime areas offering ample job opportunities to individuals with strong coding skills. Importantly, it is not just professionals with educational qualifications who are making strides in the industry; individuals with coding skills are tremendously successful. Does this sound like a trip down memory lane? Was […]The post How to Restart Your Developer Career After a Long Break appeared first on Simple Programmer.
Everyone makes mistakes, even experienced DBAs (database administrators). And the interesting thing about the mistakes DBAs make is that they don’t all relate to technology. As you will see, many of the top DBA mistakes happen due to immature policies and practices. Take these as an example: How much does downtime cost the company? When […]The post 10 SQL Server Mistakes DBAs Need to Avoid appeared first on Simple Programmer.
I am announcing today the relaunch of my best-selling book, Soft Skills: The Software Developer's Life Manual. This is the second edition. I'm going to tell you why you're going to want to pick up a copy today and a special bonus. I'm going to give you something that is going to be worth far more […]The post Soft Skills 2nd Edition OFFICIAL LAUNCH appeared first on Simple Programmer.
How To Never Run Out of Topics for Your Programming Blog
As your programming blog starts to grow, you might reach a point where ideas for content begin to dry up. That can happen when you’ve been blogging for a while. It can also occur at the moment you start your blog. It's usually a result of picking a niche that's too specific and hard to think […]The post How To Never Run Out of Topics for Your Programming Blog appeared first on Simple Programmer.
7 Reasons Why You Should Use Rust Programming For Your Next Project
The 2019 Stack Overflow survey has confirmed that Rust is the most loved programming language (preferred by a whopping 83.5% of programmers) for over four years now. This means that those who have taken the plunge and actually used Rust programming are in awe of it. However, Rust still isn’t among the top five most […]The post 7 Reasons Why You Should Use Rust Programming For Your Next Project appeared first on Simple Programmer.
A Programmers’ Guide to Grow Your Personal Brand on Twitter
Personal branding is the latest buzz. Everyone is talking about building a personal brand and how it is helping them. Well, there is definitely truth to that. As a programmer, your personal brand is like a resume or portfolio that you are putting out in the online world. Anyone who wants to hire you, work […]The post A Programmers’ Guide to Grow Your Personal Brand on Twitter appeared first on Simple Programmer.
I'm gonna do a real quick video today to tell you about how not to be scammed as a freelancer, as a programmer, because you might be doing some side jobs. It's a really dangerous situation in terms of wasting your time. I've been scammed myself, I've been not paid for things and it's […]The post How to not get SCAMMED as a FREELANCER Programmer appeared first on Simple Programmer.
Programming
how to make money as a programmer from home
A Programmer’s Guide to Compliance Regulations
An important part of the planning phase of the software development life cycle is understanding what regulations will apply to your software. If you are an independent programmer looking to build your own startup, you need to understand these regulations so you can avoid heavy fines, criminal lawsuits, or a potential suspension of your business. […]The post A Programmer’s Guide to Compliance Regulations appeared first on Simple Programmer.
How To Build a Project and Then Use It To Land a Job
You’ve probably heard that completing side projects helps you learn how to code. This is good advice to follow. Coding projects help you apply what you learned in the classroom and give you an opportunity to create something real and tangible. But there’s more to it. Side projects aren’t simply about driving the learning process […]The post How To Build a Project and Then Use It To Land a Job appeared first on Simple Programmer.
4 Reasons To Switch to Product Management and One Big Reason Not To
Product management is a common career change for developers. I did it and loved it. The primary responsibility of a product manager is determining the strategy and scope of the product. You’ll focus on answering questions like: What problems does the product solve? Who for? How does it solve them? Product management works closely with […]The post 4 Reasons To Switch to Product Management and One Big Reason Not To appeared first on Simple Programmer.
The rewards of being a software programmer can turn into a pyrrhic victory if issues such as untackled job burnout, fostering of imposter syndrome, and working late hours are left unchecked. Greg Baugues, a developer who has struggled with bipolar disorder, explains in his speech “Developers and Depression” how depression specifically impacts the developer community […]The post 7 Tips to Stay Healthy as a Software Developer appeared first on Simple Programmer.
I know a lot of you guys are in the junior developer role and you want to figure out how you can succeed, how you can climb the little corporate ladder and get your big paycheck at that senior title. I did this, I help a lot of people do this. It's not that hard. […]The post Succeeding as a Junior Developer appeared first on Simple Programmer.
Welcome to our 2020 Gifts for Programmers Guide! Unlike our other articles, this one is not just for our Simple Programmers. It is for their friends, family, and loved ones who have a hard time choosing the perfect present for the tech enthusiasts in their life. Have a programmer who you have to find the perfect gift […]The post 2020 Gifts for Programmers Guide appeared first on Simple Programmer.
The job market for programmers has arguably never been more competitive. While the demand for programmers is high, remote work and access to a global workforce have provided companies with a larger pool of potential employees while at the same time driving down labor costs. This means that if you want to land your dream […]The post How LinkedIn Can Help You Land a Programming Job appeared first on Simple Programmer.
It is not a secret that IT professionals, in general, are not necessarily known for their communication and social skills. But this is not limited to web developers, designers, coders, and the like; it is common for professionals across different industries. And often, this challenge is highlighted during a person’s job search. Most IT professionals would […]The post How To Tell if an Interview Went Well appeared first on Simple Programmer.
The Complete Guide to Cybersecurity for New Programmers
When starting out as a programmer, you’re way too busy to pay attention to anything unrelated to your core competencies. You have your hands full with learning new technologies and languages, and executing projects. Though it may be tough to squeeze in, it’s important that you find the time to learn cybersecurity. Code that is […]The post The Complete Guide to Cybersecurity for New Programmers appeared first on Simple Programmer.
The global network of smart and connected objects (internet of things or IoT) has changed every industry. These smart devices have contributed to what we commonly refer to as the Fourth Industrial Revolution or Industry 4.0. It’s tough to imagine life without these connected devices as they make life easier and give us greater control […]The post The Essential Guide to Becoming an IoT Developer appeared first on Simple Programmer.
Are you tired of boring 9-5 office jobs and looking for a little more freedom? Maybe there’s something about going into the office every day that has you looking for alternative, more remote programming career options. Whatever your reasoning may be, starting a career in freelance programming can be the answer you’re looking for. Freelancing […]The post How To Start a Freelance Programming Career appeared first on Simple Programmer.
How To Build a Chatbot: 4 Components of Creating a Bot
As a programmer, I wanted to learn how to develop a chatbot by creating my own. I focused on the following components of developing a bot: the Natural Language Understanding (NLU) component, the Dialog Manager (DM), the Natural Language Generation (NLG) component, and the Modularized, or end-to-end, approach. To build a successful chatbot, consider the […]The post How To Build a Chatbot: 4 Components of Creating a Bot appeared first on Simple Programmer.
The software development life cycle is a set of steps that help developers plan and create software in an organized way. It has six main steps. Each step comes with its own security issues that need to be addressed. Failure to address these issues can result in an application full of bugs that, once exploited, […]The post A Complete Developer’s Guide to Securing the SDLC appeared first on Simple Programmer.
Today I'm going to talk about getting some passive income as a programmer. I happen to be an expert in this because that's what I did: I built a lot of passive income as a programmer. #passiveincomeasadeveloper #passiveincomeideas #waystomakepassiveincome […]The post How to Make Passive Income as a Programmer appeared first on Simple Programmer.
With over 2.8 million apps in Google Play Store and 1.8 million apps available in Apple’s App Store as of March 2020, the app market is growing at an exponential rate. Smartphone users are using mobile apps for different professional, personal, and entertainment purposes. But with so many apps being published every day, is your app […]The post How To Acquire Users For Your App appeared first on Simple Programmer.
16 Free Tools & Services for Developers: Better and Higher Productivity
Are you looking for killer free tools and ammunition to make your work efficient? Let’s indulge you with some personally tried and tested tools that will boost your productivity and help you track your daily project tasks, collaborate, and seamlessly communicate with your team, all that from free tools. In this post, I have listed […]The post 16 Free Tools & Services for Developers: Better and Higher Productivity appeared first on Simple Programmer.
I think a lot of you are job-hopping. That's fine. It's cool. I get it, but you're not leaving the best trail behind you when you're leaving your job. So in this video, I'm gonna tell you how to quit your job in such a way that you don't screw yourself over. Transcript Of The […]The post How to LEAVE Your Programming Job appeared first on Simple Programmer.
Learning to program can be brutal. You never know if you’re learning the right things, and it’s easy to become overwhelmed by how much content there is to learn. That brings up a good thought: How do you know when you’ve learned enough to start applying for jobs? Chances are, you’re concerned with how long […]The post 4 Tips for Learning How to Code Faster appeared first on Simple Programmer.
If you struggle to keep up with the demands of software development, your diet may be to blame. Your brain relies on food for fuel for maximum performance, but not all fuel is made the same. Some foods are high in protein. Meanwhile, other foods are high in sodium and processed sugar. Reflect on the […]The post How Microbiomes Can Make You a Better Developer appeared first on Simple Programmer.
How to Pass AWS Certified Architect Associate Exam
“Failure doesn’t mean the game is over, it means try again with experience.” (Len Schlesinger) I recently passed my Amazon Web Services (AWS) Certified Architect Associate Exam (woohoo!). It was not easy and took a lot of work, but it was all worth it. This certification can help you start a career in cloud […]The post How to Pass AWS Certified Architect Associate Exam appeared first on Simple Programmer.
What’s Load Testing and How Does a Locust Framework Help?
There are many tools leveraged for software testing, QA testing, and performance testing on an existing server. You would think that with all these tools at your disposal, these forms of testing would be easy to manage. However, benchmark testing to ascertain a server's performance under different load conditions can be challenging. It's often a […]The post What’s Load Testing and How Does a Locust Framework Help? appeared first on Simple Programmer.
Choosing A Gamification LMS: Features To Look Out For
Engaging learners, be it in a K-12 scenario or employee training, is a known challenge in development and learning circles. Over the years, a lot of research has been conducted with the hope of finding out more about how the human brain learns. Thanks to this research, combined with advancements in technology, we now have […]The post Choosing A Gamification LMS: Features To Look Out For appeared first on Simple Programmer.
How to Not Be Awkward In Conversations AS A PROGRAMMER
I know a lot of you guys are shy, you might consider yourself an introvert. I don't really believe in the term, to be honest with you. So in this video, we're going to talk about confidence and how to just be confident in a conversation. #howtonotbesociallyawkward #howtostopbeingshyandawkward #howtoholdaconversation Transcript Of The Video John Sonmez: […]The post How to Not Be Awkward In Conversations AS A PROGRAMMER appeared first on Simple Programmer.
Different people deal with challenging situations differently, and the way a crisis affects the individual psyche is unique to everyone. Some people withdraw into themselves when they feel threatened: They focus on their own personal lives and obsess over what they can control in a time that generally feels out of control. Others have a […]The post How To Keep Going When You Have No Motivation appeared first on Simple Programmer.
A Programmer’s Guide to Crowdsourcing Security with Bug Bounties
Developing secure software is a key element to modern software development. Around 75% of programmers worry about security in their applications and 85% rank security as very important in the coding and development process. This has bred a huge emphasis on “secure by design,” which is the concept of creating software with security already built […]The post A Programmer’s Guide to Crowdsourcing Security with Bug Bounties appeared first on Simple Programmer.
Happy Birthday, Backstage: Spotify’s Biggest Open Source Project Grows Up Fast
TLDR: As Backstage turns one, we’re doubling down on our commitment to the open source project and the community we’re building it with. From Hack Week hunch to CNCF Sandbox Last year, a small team of Spotifiers had a hunch about our homegrown developer portal: if Backstage could help our 1,600+ engineers manage the 14,000+ […]
In the second edition of our Level Up series, we explore how to create shapes, animations, and art using p5.js.The post Level Up: Creative coding with p5.js – part 1 appeared first on Stack Overflow Blog.
In this series, we look at the most loved languages according to the Stack Overflow developer survey, the spread and use cases for each of them and collect some essential links on how to get into them. First up: Rust.The post Getting started with … Rust appeared first on Stack Overflow Blog.
Rather than dig into complex math or over-simplify by using a pre-written function, we'll write our own binomial test function, primarily using base Python. In the process, we'll learn more about how hypothesis testing works and build intuition for how to interpret a p-value.The post Level Up: Mastering statistics with Python – part 5 appeared first on Stack Overflow Blog.
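A minimal sketch of what a hand-written binomial test in base Python could look like. This is not the function from the linked post; it assumes an exact two-sided test that sums the probabilities of every outcome at least as unlikely as the observed one, and the names `binomial_pmf` and `binomial_test` are hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_test(observed, n, p):
    """Two-sided exact p-value under the null hypothesis of success rate p.

    Adds up the probability of every outcome that is no more likely
    than the observed count.
    """
    p_observed = binomial_pmf(observed, n, p)
    # Small relative tolerance so ties are not lost to float rounding.
    threshold = p_observed * (1 + 1e-9)
    total = sum(
        binomial_pmf(k, n, p)
        for k in range(n + 1)
        if binomial_pmf(k, n, p) <= threshold
    )
    return min(1.0, total)


if __name__ == "__main__":
    # 8 heads in 10 flips of a supposedly fair coin.
    print(binomial_test(8, 10, 0.5))  # 0.109375
```

The p-value here reads as "if the coin were fair, a result at least this lopsided would show up about 11% of the time", which is the intuition the excerpt points at.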
Comparing summary statistics like the mean and median can help us understand how these variables are related, but we can learn even more by using visualizations.The post Level Up: Mastering statistics with Python – part 3 appeared first on Stack Overflow Blog.
The Overflow #61: I followed my dreams and got demoted to software developer
Welcome to ISSUE #61 of the Overflow! This newsletter is by developers, for developers, written and curated by the Stack Overflow team and Cassidy Williams at Netlify. This week, understanding communication cultures, the lack of inconsistencies in Newton’s universe, and understanding how CPUs handle negative numbers. From the blog Why are video calls so tiring? You might be The post The Overflow #61: I followed my dreams and got demoted to software developer appeared first on Stack Overflow Blog.
Podcast 315: How to use interference to your advantage – a quantum computing catch up
The power of quantum computing has implications for security, cryptocurrency, and drug discovery.The post Podcast 315: How to use interference to your advantage – a quantum computing catch up appeared first on Stack Overflow Blog.
Podcast 314: How do digital nomads pay their taxes?
Many tech companies are shifting to make remote work the norm. Employees get a lot of freedom, but there are a few wrinkles to consider.The post Podcast 314: How do digital nomads pay their taxes? appeared first on Stack Overflow Blog.